Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of OPTRAM algorithm to derive soil moisture from remote sensing imagery #612

Open
14 of 29 tasks
micha-silver opened this issue Oct 12, 2023 · 97 comments
Open
14 of 29 tasks

Comments

@micha-silver
Copy link

micha-silver commented Oct 12, 2023

Submitting Author Name: Micha Silver
Submitting Author Github Handle: @micha-silver
Other Package Authors Github handles: (comma separated, delete if none) @github_handle1, @github_handle2
Repository: https://gitlab.com/rsl-bidr/roptram
Version submitted: 0.0.1.000
Submission type: Standard
Editor: @adamhsparks
Reviewers: @harryeslick, @obrl-soil

Archive: TBD
Version accepted: TBD
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: rOPTRAM
Title: Derive soil moisture using the OPTRAM algorithm
Version: 0.0.1.000
Authors@R: c(
  person("Micha", "Silver", , "silverm@post.bgu.ac.il", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0002-1128-1325")),
  person("Arnon", "Karnieli", , "karnieli@bgu.ac.il", role = c("ctb", "fnd"),
          comment = c(ORCID = "0000-0001-8065-9793"))
  )
Description: The OPtical TRapezoid Model (OPTRAM) derives soil moisture
  based on the linear relation between a vegetation index
  and Land Surface Temperature (LST). The Short Wave Infra-red (SWIR) band
  is used as a proxy for LST. See:
  Sadeghi, M., Babaeian, E., Tuller, M., Jones, S.B., 2017.
  The optical trapezoid model:
  A novel approach to remote sensing of soil moisture 
  applied to Sentinel-2 and Landsat-8 observations.
  Remote Sensing of Environment 198, 52–68,
  https://doi.org/10.1016/j.rse.2017.05.041 .
License: GPL (>= 3) + file LICENSE
URL: https://gitlab.com/rsl-bidr/roptram.git
BugReports: https://gitlab.com/rsl-bidr/
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Depends:
  R (>= 4.1.0)
Imports:
    dplyr,
    ggplot2,
    sf,
    terra,
    tools,
    utils
Suggests:
    geojsonio,
    geojsonlint,
    stats,
    sen2r (> 1.5.0),
    testthat (>= 3.0.0),
    xml2
Config/testthat/edition: 3

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):
    This package includes acquiring satellite imagery, and preparing spatially explicit soil moisture raster grids.

  • Who is the target audience and what are scientific applications of this package?
    Researchers in ecology, agriculture, sustainability. Agricultural management of grazing lands, reforestration.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
    No

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
    Yes

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

  • Explain reasons for any pkgcheck items which your package is unable to pass.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Copy link
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Copy link
Collaborator

Error (500). The editorcheck service is currently unavailable

@ropensci ropensci deleted a comment from ropensci-review-bot Oct 13, 2023
@ropensci ropensci deleted a comment from ropensci-review-bot Oct 13, 2023
@maelle
Copy link
Member

maelle commented Oct 13, 2023

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Copy link
Collaborator

Error (500). The editorcheck service is currently unavailable

@maelle
Copy link
Member

maelle commented Oct 13, 2023

@micha-silver thanks for your submission! We'll fix this glitch in our system next week.

@micha-silver
Copy link
Author

Hi @maelle : It's good to hear from you. I'll look forward to getting the review underway next week.

@noamross
Copy link
Contributor

Thanks @micha-silver for your submission. As Maëlle has noted our automatic package checker has a glitch, one that we think is a GitLab-related edge case. (We have fewer - totally fine! - submissions from GitLab so the system has had less stress-testing against it).

In the meantime, I've determined that your package is in-scope. From an initial manual check of the repository I note a few important things that that will be required before proceeding to assign an editor and reviewers. It may be worth moving ahead on these while we get the automatic checks going:

@ropensci ropensci deleted a comment from ropensci-review-bot Oct 16, 2023
@maelle
Copy link
Member

maelle commented Oct 16, 2023

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Copy link
Collaborator

🚀

Editor check started

👋

@ropensci ropensci deleted a comment from ropensci-review-bot Oct 16, 2023
@ropensci ropensci deleted a comment from ropensci-review-bot Oct 16, 2023
@ropensci-review-bot
Copy link
Collaborator

Checks for rOPTRAM (v0.0.1.000)

git hash: dd810f2f

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✖️ These functions do not have examples: [aoi_to_name, check_aoi, check_scihub_access].
  • ✖️ Package has no continuous integration checks.
  • ✖️ Package coverage failed
  • ✖️ R CMD check process failed with message: 'Build process failed'.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3) + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 125
internal rOPTRAM 20
internal datasets 1
imports terra 25
imports utils 7
imports dplyr 3
imports sf 1
imports ggplot2 NA
imports tools NA
suggests xml2 17
suggests stats 8
suggests sen2r 1
suggests geojsonio NA
suggests testthat NA
suggests knitr NA
suggests rmarkdown NA
linking_to NA NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

file.path (16), lapply (15), c (13), basename (7), paste (7), strsplit (6), unlist (6), data.frame (5), list.files (5), as.data.frame (4), dir (4), nrow (4), grepl (3), sample (3), as.Date (2), do.call (2), length (2), rbind (2), split (2), t (2), as.character (1), dirname (1), gsub (1), log2 (1), max (1), min (1), package_version (1), paste0 (1), path.expand (1), row.names (1), suppressWarnings (1), Sys.which (1), system.file (1), T (1), try (1)

terra

rast (10), as.data.frame (6), writeRaster (4), ext (3), crs (1), project (1)

rOPTRAM

calculate_str (6), calculate_vi (5), aoi_to_name (3), check_scihub_access (2), optram_wetdry_coefficients (2), check_aoi (1), optram_calculate_str (1)

xml2

read_xml (5), xml_find_first (5), xml_text (5), xml_contents (1), xml_find_all (1)

stats

lm (3), quantile (3), offset (1), step (1)

utils

data (3), vi (3), packageVersion (1)

dplyr

full_join (3)

datasets

iris (1)

sen2r

check_gcloud (1)

sf

st_read (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 13 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 6 imported packages
  • 15 exported functions (median 42 lines of code)
  • 21 non-exported functions in R (median 20 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 13 68.2
files_vignettes 2 85.7
files_tests 11 91.7
loc_R 759 60.2
loc_vignettes 142 37.0
loc_tests 206 55.5
num_vignettes 1 64.8
n_fns_r 36 45.9
n_fns_r_exported 15 58.5
n_fns_r_not_exported 21 42.1
n_fns_per_file_r 2 32.7
num_params_per_fn 4 54.6
loc_per_fn_r 30 76.5
loc_per_fn_r_exp 42 74.5
loc_per_fn_r_not_exp 20 63.0
rel_whitespace_R 15 56.6
rel_whitespace_vignettes 23 25.0
rel_whitespace_tests 16 47.0
doclines_per_fn_exp 26 25.3
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 20 46.4

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)


3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following error:

  1. Error in proc$get_built_file() : Build process failed

R CMD check generated the following check_fail:

  1. no_import_package_as_a_whole

Test coverage with covr

ERROR: Test Coverage Failed

Cyclocomplexity with cyclocomp

Error : Build failed, unknown error, standard output:

  • checking for file ‘roptram/DESCRIPTION’ ... OK
  • preparing ‘rOPTRAM’:
  • checking DESCRIPTION meta-information ... OK
  • installing the package to build vignettes
    -----------------------------------
  • installing source package ‘rOPTRAM’ ...
    ** using staged installation
    ** R
    ** inst
    ** byte-compile and prepare package for lazy loading
    Error in library(hexSticker) : there is no package called ‘hexSticker’
    Error: unable to load R code in package ‘rOPTRAM’
    Execution halted
    ERROR: lazy loading failed for package ‘rOPTRAM’
  • removing ‘/tmp/RtmpcVV09j/Rinst1adf150625cf/rOPTRAM’
    -----------------------------------
    ERROR: package installation failed

Static code analyses with lintr

lintr found the following 67 potential issues:

message number of times
Avoid 1:length(...) expressions, use seq_len. 2
Avoid library() and require() calls in packages 4
Lines should not be more than 80 characters. 59
Use <-, not =, for assignment. 2


Package Versions

package version
pkgstats 0.1.3.9
pkgcheck 0.1.2.10


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@micha-silver
Copy link
Author

Thanks:

* We require that all packages have at least on vignette (https://devguide.ropensci.org/building.html#documentation)

I've written and added an appropriate vignette.

* We also require a CONTRIBUTING.md or similar file to provide guidance to potential package reviewers (https://devguide.ropensci.org/collaboration.html?q=contributing.md#contributing-guide)

Added CONTRIBUTING.md (in docs subdir)

* We require testing against the current (release), last (oldrel), and development (devel) versions of R, as well as across linux, macos, and windows platforms (usually just on release).  We also require coverage reporting of testing.  Based on your [.gitlab-ci.yml file](https://gitlab.com/rsl-bidr/roptram/-/blob/main/.gitlab-ci.yml), I think you are only testing on `release`. (https://devguide.ropensci.org/ci.html)

Yes, I have only one CI so far. I'll start preparing tests for additional releases and OS.

@noamross
Copy link
Contributor

Thanks, @micha-silver. Now that we have automated checks, I also see that the repo doesn't show the test coverage (though you may be measuring this locally as I see you have a .covrignore file). Since reviewers may be less familiar with navigation in GitLab, please put badges in your readme to point us to the CI build logs and coverage outputs once those are set up.

@micha-silver
Copy link
Author

Correct, I'm running covr locally.

My coverage is not good, (about 48%) and I'd like some advice here: I have three main functions that pull down multiple Sentinel 2 or Landsat imagery tiles, over a time span.

  • The wrapper function: optram()
  • optram_acquire_s2()
  • optram_landsat

Even if I do a test that limits the time span to one day, it would be > 1GB download for each of these functions. I think that such a big download should not be done in a test. How can I work around this?

@noamross
Copy link
Contributor

This is a common challenge we see. There are several complementary strategies one can take:

  • Write mock tests that test that the functions work with pre-cached data or outputs: https://books.ropensci.org/http-testing/mocking.html
  • Cache data required for testing in the package, or in another public location from which fetching is faster or more accessible. (Such as attached to the GitHub repository as a release, or in a Zenodo record). Pre-process if needed or possible to reduce size.
  • Break up the functions into components that require downloading and those that don't, and write separate tests to cover the latter.
  • Make tests conditional so they don't need to be run all the time, such as making them conditional on setting an environment variable ROPTRAM_ALL_TESTS=1. This would normally be turned off on CI and thus lower coverage would be reported, but that's OK. The test-running procedure should be explained in CONTRIBUTING.md and reviewers will be able to consider it.

@micha-silver
Copy link
Author

@noamross : I'm considering migrating the rOPTRAM repo from gitlab to github. I think I'll be able to configure the CI tests more easily on github (lots more examples...). And since most ROpenSci projects are on github, the review process might also be smoother. Shall I go ahead with this? Anything I need to be aware of??

Regarding your explanation above about testing functions with large downloads, I'd like to go with the 4th option, setting a condition, and skipping those tests (with explanation in CONTRIBUTING).

Thanks, Micha

@noamross
Copy link
Contributor

noamross commented Nov 6, 2023

@micha-silver No problem for us if you move to GitHub, though we are happy to do the process either way. We might have to make a tweak so the bot knows that the repository is in a new location, so let us know when you switch.

@micha-silver
Copy link
Author

Hi:
I've made progress with the rOPTRAM package:

  1. Added examples to all functions
  2. CI tests are now integrated.
  3. covr::package_coverage() now shows:
r$> covr::package_coverage()
rOPTRAM Coverage: 75.49%
R/optram_acquire_s2.R: 17.86%
R/utilities.R: 85.29%
R/optram_wetdry_coefficients.R: 95.10%
R/optram_ndvi_str.R: 95.12%
R/optram_calculate_str.R: 95.45%
R/optram_soilmoisture.R: 100.00%

(TODO: Still needs to be improved. I've added one addition testthat function to test downloading in the optram_acquire_s2.R function by limiting the timespan so that only one sentinel tile is downloaded. But that function still has low coverage. )

  1. I run tests on 5 platforms: Windows release, Windows oldrel, Windows devel, MacOS, and Ubuntu. The gitlab-ci.yml performs these tests by sending to rhub. Here's the summary currently:
                  platform errors warnings notes
       ubuntu-gcc-devel      0        0     0
 windows-x86_64-release      0        0     0
  windows-x86_64-oldrel      0        0     0
   windows-x86_64-devel      0        0     0
          macos_release      0        0    11

[1] "2023-11-26 11:21:16.729992  - Check completed"

Here are links to the results:
for Ubuntu:
for Win release: release
for Win devel:
for Win oldrel:
for Mac macbuilds:

Not sure how long these artifacts are kept on rhub. But the full artifact on gitlab is available:
artifacts on gitlab

What's next?
Best regards, Micha

@micha-silver
Copy link
Author

Following up:
On my local machine I get good coverage:

r$> cvr <- covr::package_coverage()

r$> cvr
rOPTRAM Coverage: 92.70%
R/utilities.R: 85.29%
R/optram_acquire_s2.R: 90.59%
R/optram_wetdry_coefficients.R: 95.10%
R/optram_ndvi_str.R: 95.12%
R/optram_calculate_str.R: 95.45%
R/optram_soilmoisture.R: 100.00%

The explanation: The optram_acquire_s2() function relies on Google Cloud CLI to download Sentinel images. I have that SDK installed locally, but I have not seen any way to install non-interactively, on a rocker image. There seems to be no work around to get access authentication to Google services, and the gsutil utility setup in a CI machine. I tried to copy my Google access token into a CI variable, and use that, but there is no way to get gsutil setup automatically to recognize a username and token. So when I run covr in gitlab-ci it skips the whole download procedure, and shows lower coverage.

@noamross
Copy link
Contributor

Hi @micha-silver, thanks for moving a ahead on this. It's good you got to 75% coverage without the gsutil dependency! I note that https://cloud.google.com/sdk/docs/downloads-interactive#silent to have some guidance for installing on CI services. I'd make good-faith effort with that, but if it is not possible a full description in the install section of the README as well as CONTRIBUTING for those running tests will be sufficient. I note this is one thing that we might ask reviewers to comment on in the next stage - would another dependency be a simpler way to fetch the images?

@micha-silver
Copy link
Author

@noamross I appreciate your encouraging words.

Two comments

  1. The link you suggested to an automated install of the Gcloud CLI works fine. The problem is that after installing you still need to be authenticated. And that, AFAIK, requires an interactive step: Google forces you to go to a certain website, put in credentials and get the OAuth token on your machine. I tried to put a valid token into a gitlab CI variable, then somehow use that to manually recreate the local auth directory (in a docker image) that gcloud requires - no success here :-(. Even if we could find some work around, it would surely be very fragile.
  2. As to a simpler way to fetch the imagery: Indeed Copernicus now (since Oct 2023) offers several new options for downloading - many more than when I began this package. We have recruited another contributor who will help me to weed thru these various options, and implement the best, smoothest ones. If you agree, I'd like to present the package in its current state for review, then add the additional image acquisition capabilities in a future version.

@noamross
Copy link
Contributor

It should be fine to present the package in its current state. Let me re-run checks.

@noamross
Copy link
Contributor

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@adamhsparks
Copy link
Member

@ropensci-review-bot set due date for @obrl-soil to 2024-03-25

@ropensci-review-bot
Copy link
Collaborator

Review due date for @obrl-soil is now 25-March-2024

@micha-silver
Copy link
Author

micha-silver commented Mar 14, 2024

Hi @adamhsparks
Just a quick update: we've fixed the final two (minor) issues with the MacOS build. Now I'm getting no errors and no warnings on all 5 platforms.
And coverage is at 92% :-)

@micha-silver
Copy link
Author

Hi @adamhsparks
No response yet from either of the reviewers. Is it time for a gentle reminder?

@obrl-soil
Copy link

Hey - sorry but the rewrite pushed this from a quiet time for me to a very busy one! I can get to this over the long weekend.

@harryeslick
Copy link

Hi @micha-silver ,
Im working my way through.
I've logged an issue for you in GitLab: rsl-bidr/roptram#1

@micha-silver
Copy link
Author

@harryeslick
Thanks for finding time for the review.
This issue is fixed in the gitlab repo: 99c04bff.
I improved the check_swir_band function, and also changed the example.

@harryeslick
Copy link

@micha-silver , here is my review,
For the most part it all works quite easily, so well done! my comments as all minor things.

I still need to read up on the Packaging guidelines so Ive not quite completed it yet, so some jobs for me to follow up on as well.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
    • nil
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README

some minor suggestions:

OAuth link missing

Running the main wrapper function optram() in the example states that

Registration on Copernicus DataSpace was done in advance, and the OAuth credentials (clientid, and secret) have been saved to the user’s home directory.

I would i like to have a link here to the vignette where setting up OAuth is demonstrated.

OAuth example suggestion

This is a bit of a stylistic preference, and keep in mind I don't fully understand the workflow, so I could be totally wrong here...
roptram/docs/articles/rOPTRAM_acquiring.html#step-4-saving-credentials

Saving credentials section seems a little long winded and unnecessary. I think it could probably be replaced with just

store_cdse_credentials("my_clientID", "My_secret")

the convenience of having a save_creds = TRUE, seems to require more time to explain that it saves in the actual workflow. but I could be wrong.

  • Vignette(s): demonstrating major functionality that runs successfully locally

  • Function Documentation: for all exported functions

Missing function in docs

main page references function optram_soil_moisture()
roptram/docs/index.html#optram_soil_moisture

Is this meant to be optram_calculate_soil_moisture()?

ALSO: optram_calculate_soil_moisture fails with default trapezoid_method
trapezoid_method is called (Line92) before arg.match(trapezoid_method) (Line108)

  • Examples: (that run successfully locally) for all exported functions

I found a few minor bugs here:

acquire_scihub(aoi, from_date, to_date, veg_index = "SAVI")
returns error

SWIR band must be either 11 or 12

optram_acquire_s2

  1. comma missing after parameter remote = "scihub"
  2. SWIR_Band default fails:

optram_calculate_soil_moisture

  1. VI_dir "SAVI" does not exist. Should this be "NDVI"?
  2. data_dir system.file call missing package parameter.
    should be data_dir <- system.file("extdata", package = "rOPTRAM")

optram_calculate_str
returns NULL
data_dir system.file call missing package parameter.
should be: BOA_dir <- system.file("extdata", "BOA", package = "rOPTRAM")

optram_ndvi_str
returns NULL
VI_list system.file call missing package parameter. Also using SAVI instead of NDVI
STR_list system.file call missing package parameter.

optram_wetdry_coefficients
Please check example here. not sure if the plot works...

plot_vi_str_cloud
Please check example here. not sure if the plot works...

retrieve_cdse_credentials
No example provided, but not sure if one is required.
Does this function need to be exported?

store_cdse_credentials
No example provided

  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 6

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Other related issues logged at GitLab

rOPTRAM::optram default argument SWIR_band is invalid

issue related to default parameter values logged as issue here: https://gitlab.com/rsl-bidr/roptram/-/issues/1

CDSE credentials saved and retrieved from different file paths

credentials bug logged : https://gitlab.com/rsl-bidr/roptram/-/issues/2

@micha-silver
Copy link
Author

@harryeslick
With the latest commit we have (I think) addressed all the suggestions that you raised. Any further comments are welcome!
(How can I get your email addr to add you as 'rev' to the DESCRIPTION file?)

@obrl-soil
Copy link

Hey, sorry this took so long! Not sure I caught everything but I think I've covered off on the main issues. Tl;dr, good job overall but I think the function design is trying to do a little too much for the end user in places.

Package Review

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README

  • Installation instructions: for the development version of package and any non-standard dependencies in README

  • Vignette(s): demonstrating major functionality that runs successfully locally

    • see below
  • Function Documentation: for all exported functions

    • see below
  • Examples: (that run successfully locally) for all exported functions

    • see below
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.

  • Functionality: Any functional claims of the software have been confirmed.

  • Performance: Any performance claims of the software have been confirmed.

  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

    • Do you need a CITATION file?
    • The README guideline suggests you may need a few more badges - code coverage, etc
    • Roxygen formatting could be improved in several places (see below)

Estimated hours spent reviewing: 8

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

  • testing against version installed 2024-04-13 on Windows 11, R 4.3.3, RStudio "Ocean Storm" Release (4da58325, 2024-01-28)
  • used QGIS desktop 3.34.2 to check outputs also

General

  • I don't think that plot_vi_str_cloud() should be calling ggsave(); let the user do that (you can demonstrate in an example) or at the very least make it optional. Depending on what the user wants to do with the graphic they'll likely want more control over dimensions, DPI, etc. Also, the way its written this function never sends a plot to the graphics device - the outputs can't be viewed in RStudio at all. This is, I think, not generally expected when plot() is called.
  • I don't think your functions should be reading spatial data in for the users, and I'm specifically focusing on the aoi_file parameter here. It creates a lot of room for weird errors which you'll then be pestered about. Instead, I think it would be far safer to require the user to read in a file using sf and supply the resultant object to aoi_file. This then gives you the ability to write unit tests and error catches to cover e.g. correct spatial type supplied, data projected in an acceptable CRS, non-insane extents, etc. You can't do any of that when all you have to work with is a file path string.

Vignettes

  • remotes::install_gitlab() does not build vignettes by default. You may wish to specify installation with build_vignettes = TRUE in your README to prevent confusion.
  • As mentioned below, "Acquiring Sentinel-2 Imagery: Step 3" will need an update to match the current state of the website.
  • The knitr::include_graphics() calls in Vignette "rOPTRAM: Three Trapezoid Fitting Methods" are failing with a 'can't find file' error - might be related to installing from gitlab though.

Documentation

  • Documentation is good (I especially like how you've specified the data types for each arg) but would benefit from some finessing, namely:

    • Use \code{} rather than backticks for code-formatting
    • Use internal cross-links to other functions like \code{\link[rOPTRAM:optram_aquire_s2]{optram_aquire_s2()}}}
    • Use \url{url} or \href{url}{alt-text} for external URL links
    • Use \doi{} for DOI links
    • Use roxygen list formatting in the more complex 'notes' fields.
    • math formatting is available for formulae with \eqn{} or \deqn{}

see https://roxygen2.r-lib.org/articles/formatting.html .

For specific functions,

aquire_scihub()

  • Most people will be thinking of another website entirely when they see the word scihub 😛 - is there an appropriate alternate name?
  • The instructions in the note for aquire_scihub() seem to be out of date; I think the website has been updated. I had to do the following:
    • Register and log in.
    • On https://dataspace.copernicus.eu/, click on 'my account' in the top right
    • When the account console loads, click on 'Sentinel Hub' in the 'Dashboards' panel
    • Click on user settings in the bottom left of the screen
    • Follow the prompts to create an oauth client as instructed
  • So this took quite a bit of extra clicking around to figure out. Also, on creating the client, there is a checkbox for 'Client will be used by a single-page application'. I assume I don't need to tick this?
  • Once the client was created, I tried to follow the next part of the instructions: "the user must save her secret ...etc" Unfortunately this function is not exported so the user can't run it directly - need to reconsider including this instruction.

exponential_coefficients()

  • missing full stop at end of first sentence in description

optram_landsat()

  • typo in description, 'Use this function to prepares...'

optram_wetdry_coefficients()

  • The Note mentions that "The vegetation index column is named "VI" though it can represent several vegetation indices, such as SAVI, or MSAVI." Would it not be preferable to name it something more generic like 'veg_index'?
  • Speaking of column naming conventions, https://emilyriederer.netlify.app/post/column-name-contracts/ provides some ideas for consistent and informative column names. This is a valuable approach for anyone outputting to multiple on-disk formats, as you are (TIF, CSV, etc)

Examples

  • examples for non-exported functions need to use rOPTRAM::: syntax to work correctly
  • examples using the landsat_dir parameter should all explain what's going on there, currently only optram_landsat() does

aquire_openeo()

  • Example fails as SWIR band not supplied - tested with 11 and then 12.
  • Example prompts in console to copy/paste an ID code that is subsequently not asked for. Not sure if that's an openeo problem? Looks like e.g. " → Copy CTGB-UGFU and paste when requested by the browser"
  • Important! Example code should assign the output to an object, e.g. acq <- aquire_openeo()... otherwise the list of files just gets reported to the console and is otherwise impossible to retrieve without re-running the code.
  • The retrieval process throws an error: "[INFO] 'custom_processes' not loaded: ModuleNotFoundError("No module named 'custom_processes'")." The extraction and processing seemed to run to completion despite this, but I can't be sure as I don't know what success should look like for this function. I have folders in my temp dir for BOA, STR and SAVI containing Geotiffs for 14 days within the range requested but can't tell if they've actually been processed correctly.
  • This function ran long for unknown reasons on a subsequent test - 'job still running' just kept printing and I had to kill the session to continue reviewing.

aquire_scihub()

  • As with aquire_openeo() the SWIR band needs to be specified or checked differently.
  • Although of course you can't include an actual clientid and secret, the example code should maybe look more like:
acq_sh <- acquire_scihub(aoi, from_date, to_date,
            veg_index = "SAVI",
            clientid = 'paste id here',
            secret = 'paste secret here',
            SWIR_band = 11)

to prompt the user directly.

  • This function ran much more quickly than acquire_openeo() and didn't report any errors to the console. I see it masks the output to the AOI polygon without being prompted too - is that a desirable/possible option for acquire_openeo()?

calculate_vi()

  • Example as written produces an all-NaN output

crop_landsat_list()

  • Example as written can't be run (appears to lead on from an example started elsewhere). Wrapping it in 'dontrun' isn't sufficient here, it should work if the user activates it deliberately.

exponential_soil_moisture(), linear_soil_moisture(), optram_calculate_soil_moisture()

  • final command in each example doesn't run - data_dir = might need editing?

optram_acquire_s2()

  • comma missing after remote = "scihub" in final command

optram_calculate_str()

  • The example outputs all have pixels with value ranges of ~1-5, except for a single outlier pixel that has a value of ~58. Is that expected behaviour? Pixel was on a patch edge, so I thought I'd better check.

optram_ndvi_str()

  • parameter VI_list returns NULL

optram_prepare_other_vi_str()

  • a print() command referencing a nonexistent R source file should not be here. This example should be rewritten or removed.

polynomial_coefficients()

  • an example should not simply read in a csv and then reference a temp dir. This example should be rewritten or removed.

store_cdse_credentials()

  • uh, whose credentials are those in the example? Should they be there??

Tests

  • In test-aquire_s2.R, I don't think that test 'AOI file is not spatial' is doing what it claims to. As discussed, given that your functions point to on-disk files rather than making the user read the data in first, I think the closest you could get here is running a mime-type check on the file extension.
  • More importantly, that kind of testing should be in the function itself, not in the test suite. It looks like in general quite a few of your tests are doing checks that should be in the functions themselves. The tests should really be focused more on the outputs of each function than their inputs; making sure they're as expected. They should only fail if you make a change to the package.
  • If you want to test for proper date formatting, you should maybe be internally converting supplied date strings to date-specific data types. I know you probably have to paste them into a string for web calls eventually, but that still gives you more options for handling and error checking.

@micha-silver
Copy link
Author

Hi @obrl-soil
Thanks for the detailed review! We'll start going over, point by point, and let you know when everything has been addressed.
(Can you send me your email, so that I can add you as 'rev' in the DESCRIPTION file?)

@micha-silver
Copy link
Author

@adamhsparks , @obrl-soil , @harryeslick :
In @obrl-soil 's review, she raises an interesting point:

"I don't think your functions should be reading spatial data in for the users, and I'm specifically focusing on the aoi_file parameter here. It creates a lot of room for weird errors which you'll then be pestered about. Instead, I think it would be far safer to require the user to read in a file using sf and supply the resultant object to aoi_file. This then gives you the ability to write unit tests and error catches to cover e.g. correct spatial type supplied, data projected in an acceptable CRS, non-insane extents, etc. You can't do any of that when all you have to work with is a file path string."

I chose to allow users to enter a file name for the area of interest, instead of requiring them to read their spatial file as an sf object, and passing that thru to the various functions. This choice was for two reasons: I use the file name (without file extension) later to add a title to the derived scatter plot. And second, I do not assume users would necessarily have knowledge of the {sf} package. However @obrl-soil suggests not to save the scatterplot to an image file at all, rather let the user do that by herself. So the first reason above would be canceled, leaving only the second reason: user's handling of {sf} objects by herself.

I'd like to ask your opinions if it's reasonable to expect user's to understand the {sf} package well enough to read in their aoi_file, and pass an sf object to the OPTRAM functions. (This would certainly save some complexity in the rOPTRAM package)

Thanks,

@adamhsparks
Copy link
Member

I agree with @obrl-soil on this. Too many edge cases. Users can be expected to have a modest understanding of {sf} to use a package that leverages it. Simplify your package. Future you will thank you.

In the past I added “support” for operations outside the focus of my package, in time I simplified following the UNIX philosophy. Do one thing. Do it well. No one ever complained about the “loss” of functionality as it was already in R with other packages.

@adamhsparks
Copy link
Member

adamhsparks commented Apr 22, 2024

I forgot to do this.

@adamhsparks
Copy link
Member

@ropensci-review-bot submit review #612 (comment) time 6

@ropensci-review-bot
Copy link
Collaborator

Logged review for harryeslick (hours: 6)

@adamhsparks
Copy link
Member

@ropensci-review-bot submit review #612 (comment) time 8

@ropensci-review-bot
Copy link
Collaborator

Logged review for obrl-soil (hours: 8)

@micha-silver
Copy link
Author

Hello all:
First a big thanks to the reviewers for their insightful comments. 👏 👏
We have gone over the list and modified as follows:

  • checked the examples, all should be working (hopefully 😁 );
  • went over the tests, and removed those tests that only checked input (inputs are validated within the functions). Almost all tests now check only for function output;
  • adopted the suggestion to input the area of interest as an sf object;
  • the scatterplot function does not save to a file, rather the function returns a ggplot object for further tweaking by the user;
  • improved documentation where necessary;
  • recompiled vignettes, with a corrected path to images.

Looking forward to any additional suggestions.

@micha-silver
Copy link
Author

Also... I'm looking for advice on one issue:
We implement two functions for building the model from pre-downloaded Landsat, or Sentinel imagery. Users who choose to download manually can build the module using optram_safe() on a directory of *SAFE folders, and optram_landat() with a directory of Landsat folders. I do not see any way to show examples, or write tests since both of these require many GB of imagery as input - way too much for a github repo.
How should I present this? Currently, in the examples, I sidestep the problem and put:

#' \dontrun{
#' aoi <- sf::st_read(system.file("extdata",
#'                               "lachish.gpkg", package = "rOPTRAM"))
#' # landsat_dir is a directory containing the original downloaded Landsat images.
#' landsat_dir <- "...enter full path here..."
#' optram_landsat(landsat_dir,  aoi,
#'                veg_index = 'SAVI',
#'                LC_output_dir = tempdir(), data_output_dir = tempdir())
#' }

@adamhsparks
Copy link
Member

@ropensci-review-bot submit response #612 (comment)

@ropensci-review-bot
Copy link
Collaborator

I'm sorry @adamhsparks, I'm afraid I can't do that. That's something only author1 and author-others are allowed to do.

@adamhsparks
Copy link
Member

Thanks, @micha-silver, @obrl-soil and @harryeslick, can you please check if you are satisfied with the changes and if you have additional comments?

@adamhsparks
Copy link
Member

Also... I'm looking for advice on one issue: We implement two functions for building the model from pre-downloaded Landsat, or Sentinel imagery. Users who choose to download manually can build the module using optram_safe() on a directory of *SAFE folders, and optram_landat() with a directory of Landsat folders. I do not see any way to show examples, or write tests since both of these require many GB of imagery as input - way too much for a github repo. How should I present this? Currently, in the examples, I sidestep the problem and put:

#' \dontrun{
#' aoi <- sf::st_read(system.file("extdata",
#'                               "lachish.gpkg", package = "rOPTRAM"))
#' # landsat_dir is a directory containing the original downloaded Landsat images.
#' landsat_dir <- "...enter full path here..."
#' optram_landsat(landsat_dir,  aoi,
#'                veg_index = 'SAVI',
#'                LC_output_dir = tempdir(), data_output_dir = tempdir())
#' }

This is how I would handle the example.

Do the data requirements expand beyond the need to download the data? That is, once the data is downloaded, is the pathway for the model the same and is therefore tested with the normal tests?

@micha-silver
Copy link
Author

This is how I would handle the example.

Do the data requirements expand beyond the need to download the data? That is, once the data is downloaded, is the pathway for the model the same and is therefore tested with the normal tests?

Yes, that's exactly what happens.

@adamhsparks
Copy link
Member

I'd not worry about tests for this then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants