Skip to content

mapscanner #330

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
15 of 25 tasks
mpadge opened this issue Jul 25, 2019 · 50 comments
Closed
15 of 25 tasks

mapscanner #330

mpadge opened this issue Jul 25, 2019 · 50 comments

Comments

@mpadge
Copy link
Member

mpadge commented Jul 25, 2019

Submitting Author: mark padgham (@mpadge)
Repository: mpadge/mapscanner
Version submitted: 0.0.1
Editor: @sckott
Reviewer 1: @potterzot
Reviewer 2: @khondula
Archive: TBD
Version accepted: TBD


  • Paste the full DESCRIPTION file inside a code block below:
Package: mapscanner
Title: Print Maps, Draw on Them, Scan Them Back In
Version: 0.0.1
Authors@R: c(
    person("Mark", "Padgham", email="mark.padgham@email.com", role=c("aut", "cre")),
    person("Michael D.", "Sumner", role = "aut", email = "mdsumner@gmail.com"),
    person("Stanislaw", "Adaszewski", role = "cph",
        comment = "author of include concaveman-cpp code"))
Description: Print maps, draw on them, scan them back in, and convert to spatial
    objects.
License: GPL-3 | BSD_2_clause + file LICENSE
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
Depends: R (>= 3.5.0)
Imports:
    cli,
    curl,
    dplyr,
    fs,
    glue,
    jpeg,
    magrittr,
    magick,
    memoise,
    pdftools,
    png,
    purrr,
    raster,
    Rcpp,
    reproj,
    RNiftyReg,
    sf,
    slippymath,
    tibble,
    RTriangle,
    gibble,
    polyclip,
    unjoin,
    rlang
Suggests: 
    ggplot2,
    knitr,
    lwgeom,
    mapview,
    osmdata,
    rmarkdown,
    testthat,
    spelling
LinkingTo: Rcpp
SystemRequirements:
    C++11
VignetteBuilder: knitr
URL: https://github.com/mpadge/mapscanner
BugReports: https://github.com/mpadge/mapscanner/issues
RoxygenNote: 6.1.1
Language: en-GB

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • database access
    • data munging
    • data deposition
    • reproducibility
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

mapscanner fits nicely into both data extraction and geospatial data. It is definitely geospatial because it concerns maps, and ultimately generates spatial output in standard (sf) format. It is nevertheless primarily a "data extraction" tool that extracts useful (spatially) structured data from unstructured input in either pdf or image formats.

  • Who is the target audience and what are scientific applications of this package?

The target audience comprises social researchers interested in spatial phenomena such as subjective perceptions of regions or areas, or perceptions of location. More specifically, these kinds of analyses are currently restricted to (highly) commercial service provision, and the package aims to aid and empower social researchers in financially disadvantageous circumstances by providing analytic abilities that would otherwise be prohibitively expensive.

No R packages whatsoever that we are aware of. Vaguely similar software exists in other languages, such as mapmatcher (in java; long abandoned) and Maplat (.js; mostly in Japanese, and comparable only in algorithmic terms; purpose is entirely different).

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Pre-submission issue #322, approved for submission by @noamross 🎆

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options
  • The package has an obvious research application according to JOSS's definition.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI: DOI
  • (Do not submit your package separately to JOSS)
MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@annakrystalli
Copy link
Contributor

Thanks for the submission @mpadge!

I'm just working on assigning an editor. Stay tuned!

@sckott
Copy link
Contributor

sckott commented Aug 1, 2019

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks for your submission @mpadge !

Here's the output from goodpractice. You don't need to address these now, it's info for reviewers to use to get started.

  • the long code line seems fine
  • i think this may be a case of FALSE positives (get it?) for the T/F check in goodpractice, so ignore that
  • I had trouble installing locally. Seems it does install form github okay, but then locally I get an openmp error, the error suggested settting an env var, so Rscript -e 'Sys.setenv(KMP_DUPLICATE_LIB_OK=TRUE); library(devtools); document(); install()' worked
── GP mapscanner ─────
It is good practice toavoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R/RcppExports.R:19:1avoid 'T' and 'F', as they are just variables which are set to the logicals 'TRUE' and 'FALSE' by default, but are not reserved words and hence can
    be overwritten by the user.  Hence, one should always use 'TRUE' and 'FALSE' for the logicals.

    R/aggregate_polys.R:NA:NA
    R/aggregate_polys.R:NA:NA
 ───────────────

Seeking reviewers now 🕐


Reviewers:

@sckott
Copy link
Contributor

sckott commented Aug 8, 2019

reviewers are @potterzot && @khondula

@mpadge
Copy link
Member Author

mpadge commented Aug 17, 2019

Thanks @sckott - I'm really excited about this package, and looking forward to improvements that will invariably arise in response to reviews. Thanks in advance @potterzot and @khondula for volunteering! The goodpractice T/F outputs are indeed false positives; the long line was not, but has since been fixed.

I'm not sure at the mo what might have caused your install issue, but guess some kind of inter-dependency between imported packages? I'll try a clean docker install asap and see if i can replicate.

@sckott
Copy link
Contributor

sckott commented Aug 23, 2019

@potterzot reminder that your review is due soon

@potterzot
Copy link

potterzot commented Aug 23, 2019

In just under the deadline! @mpadge thank you for your submission, this is a super cool package. We may end up using it in some work we're doing with water rights, so it's great to see! That said, I couldn't actually get it to download maps given a bbox, but looking forward to trying it out once I can.

I printed a USGS topo map and draw a trail on it to check it's application for hiking and was excited with the results. While the package is explicitly aimed at researchers, another use case would be hikers and runners who want to plan a route but don't want to click on an online map. This would let you draw the route, scan it in to the map, and have it uploaded as a shape. If it could be then converted to a GPX file that would be really neat.

I've used the below checklist from the rOpensci review template. Comments at the end are my additions. Please feel free to comment with any clarifying questions.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 8

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

General Comments

The package seems great. There are some problems I had with checking and running an example, and some code style differences. I left check boxes above unchecked where there were comments I made to suggest changes. The Statement of Need and Vignette sections above are unchecked beccause the README does not have a clear statement of use, although the vignette does. I would suggest just copying that to the README, or possibly making it a bit simpler as well. The vignette is fine but was not building properly for me. The Functionality and Automated Tasks checkboxes are unchecked because I received an error when trying to run the example from the README, detailed below and I think that automated tests of the API may help with identifying possible issues when attempting to download maps. Finally, Package Guidelines is unchecked because of style discrepancies that are detailed below.

Specific Comments

Errors
  • devtools::check() fails with this error when trying to build the vignette:
File mapshot-polys.png not found in resource path
  • When I try to run the code from the README, I get the following error:
bbox <- rbind (c (-117.25000, -117.37500),
               c (45.12500, 45.16667))

ms_generate_map (bbox, max_tiles = 16L, mapname = "eagle_cap")
#> Error in curl::curl_download(url = api_query, outfile) : HTTP error 401.

curl::curl_download() works with a simple html file, so it doesn't seem to be curl that is breaking.

Documentation
  • As discussed in your comment below, add explicit information to the README and vignette about getting and setting a token for map authentication. As part of that I would also add more information about where the maps are being downloaded from.

  • Since I got an error (detailed below) when attempting to generate a map, instead I printed a pdf of a map that I had, drew on it, and then scanned it in. The maps are 2.2MB and 9.9MB. The ms_rectify_maps() function ran for three hours and then it was getting close to the time I had to leave so I killed it. It did not exit cleanly. Providing a check interrupt as in this stackexchange would make it easier to recover if it's taking too long. As it is my R session froze and I had to kill Rstudio and start over.

  • This may be a GB/US difference. Throughout the package the use of the word "scan" to mean "reading an image file into R" is confusing. In the U.S., most users will take this to mean actually using a scanner. For example, the documentation for ms_rectify_maps() reads "Scan in two pdf or png maps, rectify them ...". But the function isn't scanning anything. This happens in other places such as the README. I would suggest working on package and individual function descriptions to clarify that the package deals with images that have been scanned in but doesn't do any scanning. I.e. the package allows users to converted scanned images into sf objects, but doesn't operate the scanner. In the US "read" would probably best replace "scan".

  • README.md: Some expansion of the text would make the package more friendly. The usage section lacks an initial description of the goal of process so it is a bit confusing what it happening (The opening paragraph of the vignette does this well for eample). It's probably worth explaining what "rectification" is for the casual user. Some sentences are incomplete, e.g. "Package comes with ...". The text in the vignette and some of the function descriptions would also be nice in the README, in accordance with the [principle of multiple points of entry:

Is the documentation (installation instructions/vignettes/examples/demos) clear and sufficient? Does it use the principle of multiple points of entry i.e. takes into account the fact that any piece of documentation may be the first encounter the user has with the package and/or the tool/data it wraps?

  • Unexported functions would benefit from comments and/or documentation to explain their purpose. This helps contributors or yourself a year from now. E.g. p_paste and sf_df.

  • It looks like roxygen documentation is being done in a mix of markdown and the default, e.g. type = "polygons" doesn't render nicely in the help file. Instead of backticks, \code{} will render correctly. Alternatively, you can set roxygen to use markdown formatting by default. If you choose that route, then I think default roxygen styling like \pkg{sf} would have to be changed.

  • The note referenced in "see Note" in the ms_rectify_maps() documentation should be under Details. Right now it's included under the Value header.

  • "Automatically save" should be "saved".

  • I was initially confused where the map was stored until I read the second line of the mapname parameter documentation. It would be helpful to add to the function description that it saves the map files in the working directory or optionally in the full path name.

  • The documentation example for ms_generate_map()](https://github.com/mpadge/mapscanner/blob/23ea1378ae6f168eaeccf1df5cc4e24e21ea3ae5/R/generate-maps.R#L26-L31) returns this error:

    Error in curl::curl_download(url = api_query, outfile) : HTTP error 401.
Code Style
  • The use of a space after the function name, e.g. library (mapscanner) doesn't conform to the rOpensci code style, which is also Hadley's style guide. Moreover, in some places there is a space and in others there isn't.
  • Similarly, if() functions should include {} if they extend to the next line, e.g. here.
  • This is more of a matter of preference, but for functions from packages that are used repeatedly throughout a function, I prefer @importFrom rather than ::. The code is cleaner and easier to read. I would do this with sf and dplyr and possibly purrr.
Code
  • this "tell Mike" statement should be expanded into an explicit message or removed entirely.

  • Reflecting this comment, handle setting of API key in an informative way.

@sckott
Copy link
Contributor

sckott commented Aug 24, 2019

thanks for your review @potterzot !

@mpadge
Copy link
Member Author

mpadge commented Aug 24, 2019

Thank you very much @potterzot for your very considered and constructive review. Before going any further can I please ask you to:

  1. Forgive my programmatic stupidity for somehow embedding a token <- Sys.getenv() call in the code and not issue any information whatsoever about the necessity or assumed name of associated token. I've rectified that already, so the code now explicitly informs you of the necessity of such a token and expected naming conventions, and should no longer fail once you've set it. Please accept my humble apologies for that blundering oversight!
  2. Would you be so kind as to try again with current version to ensure that you can successfully generate a map, and maybe amend your review accordingly, or append any new insights you might have?

(I haven't yet updated the README and associated docs & vignettes to more explicitly inform of the necessity of a mapbox token, but will do so asap.) In all other ways, your review is very helpful, and I concur with all points you raise, including my ill-chosen use of "scan" throughout. I'll provide a full response as soon as I return from current travels in 2 weeks or so. Note also that your vision of additional use case is brilliant, and one which absolutely should and will be addressed. It's also one which I've been ignoring because of this issue which details how surprisingly difficult it is to extract lines -- and in particular line start and end points -- from raster objects. I really want to solve this, and your use case provides the perfect impetus. Please feel free to chip in any insights you might have on the associated issue as well, because we'd love any and all help in cracking that one. @sckott Any insights from you would also be hugely appreciated!

@potterzot
Copy link

@mpadge ah okay, no problem. I have added the code and README/vignette revisions to my review comment and am happy to take a second look after your review response.

Adding line extraction would be really neat, but I also don't think it needs to be done as part of the review process. Maybe a later feature down the road.

@sckott
Copy link
Contributor

sckott commented Sep 4, 2019

@khondula plz and thanks, your review is due in a few days

@mpadge
Copy link
Member Author

mpadge commented Sep 5, 2019

Thanks @potterzot for the very productive and helpful review, for which I now offer the following responses, quoting your review comments throughout as appropriate. But first:

need for mapbox API token

The previous inability to use one of the main functions was caused by my hard-coding an API token as environmental variable, as described above. This has been rectified so that the package will automagically work if it detects any kind of environmental variable with "mapbox" or "mapscan" in the name, otherwise it will prompt the user with the following:

"Map generation requires a mapbox API key to be set with Sys.setenv 
with a name that includes either the strings 'mapbox' or 'mapscanner'. 
Tokens can be obtained from 
https://docs.mapbox.com/api/#access-tokens-and-token-scopes"

review responses

There are ... some code style differences.

These should have now been rectified, particularly by rendering the code style of the distinct aggregate_polys.R function to be consistent with the remainder of the package.

The Statement of Need and Vignette sections above are unchecked beccause the > README does not have a clear statement of use, although the vignette does. I > would suggest just copying that to the README, or possibly making it a bit > simpler as well.

and further:

README.md: Some expansion of the text would make the package more friendly. The usage section lacks an initial description of the goal of process so it is a bit confusing what it happening (The opening paragraph of the vignette does this well for eample). It's probably worth explaining what "rectification" is for the casual user. Some sentences are incomplete, e.g. "Package comes with ...". The text in the vignette and some of the function descriptions would also be nice in the README, in accordance with the [principle of multiple points of entry:

The initial vignette sections describing package motivation and principles of usage have been copied to the README as suggested.

The vignette is fine but was not building properly for me... The > Functionality and Automated Tasks checkboxes are unchecked because I received > an error when trying to run the example from the README, detailed below

I presume these issues were because of my previous error in handling mapbox API keys, and hope that it should now build once a key has been set via appropriately named environmental variable. (Sys.setenv ("...MAPBOX...") = <my_key>).

I think that automated tests of the API may help with identifying possible > issues when attempting to download maps.

While the package has 100% coverage, this is admittedly because the bit of code directly responsible for downloading the maptiles is currently not tested. These lines could be mock tested, but would nevertheless require an API key that would have to be hard-coded in one particular way, and would thus still potentially not capture all possible ways by which such keys might be set and used. The new code for dealing with tokens via environmental variables should be failsafe for most use cases, and should issue sufficiently clear guidelines to ensure that tokens are appropriately set, as described above.

Errors

devtools::check() fails with this error when trying to build the vignette:
File mapshot-polys.png not found in resource path

When I try to run the code from the README, I get the following error ... > curl::curl_download() works with a simple html file, so it doesn't seem to be > curl that is breaking.

The latter, and as far as I can tell, the former as well, were both caused by previous lack of appropriate, and appropriately described need for, mapbox token.

Documentation

I would also add more information about where the maps are being downloaded from.

Documentation for ms_generate_map() now states that the function:

Generate(s) a map image for a specified area or bounding box, by downloading tiles from \url{https://mapbox.com}.

(See this commit).

Since I got an error (detailed below) when attempting to generate a map, instead I printed a pdf of a map that I had, drew on it, and then scanned it in. The maps are 2.2MB and 9.9MB. The ms_rectify_maps() function ran for three hours and then it was getting close to the time I had to leave so I killed it. It did not exit cleanly. Providing a check interrupt as in this stackexchange would make it easier to recover if it's taking too long. As it is my R session froze and I had to kill Rstudio and start over.

These issues extend from the RNiftyReg package, for which I have opened an issue requesting an interrupt function. There have also been previous observations that functions in this package do not exit cleanly, but causes and solutions unfortunately lie beyond my control in the current package.

What i have nevertheless done is to implement an automatic reduction in size of large images, as detailed in issue #15. I have fixed the size to ensure files submitted to RNiftyReg are <1MB in size, although there could be more flexible ways to do this? Nevertheless, as noted there, RNiftyReg routines are generally designed to only work with images up to a maximum of 2,048 pixels in any one dimension, so this seems a safe approach. I'd welcome any further thoughts or suggestsions @potterzot ...

This may be a GB/US difference. Throughout the package the use of the word "scan" to mean "reading an image file into R" is confusing. In the U.S., most users will take this to mean actually using a scanner. For example, the documentation for ms_rectify_maps() reads "Scan in two pdf or png maps, rectify them ...". But the function isn't scanning anything. This happens in other places such as the README. I would suggest working on package and individual function descriptions to clarify that the package deals with images that have been scanned in but doesn't do any scanning. I.e. the package allows users to converted scanned images into sf objects, but doesn't operate the scanner. In the US "read" would probably best replace "scan".

As far as I could tell, there was only that one mentioned reference to the package itself doing any "scanning". I have now modified that to say that the function can be used to, "Rectify two previously scanned-in pdf or png maps with RNiftyReg" (see this issue). The word "scan" is indeed used to imply actually using a scanner (or camera, or other equipment).

Unexported functions would benefit from comments and/or documentation to explain their purpose. This helps contributors or yourself a year from now. E.g. p_paste and sf_df.

Ping co-author @mdsumner - could you please PR a fix there?

It looks like roxygen documentation is being done in a mix of markdown and the default, e.g. type = "polygons" doesn't render nicely in the help file. Instead of backticks, \code{} will render correctly. Alternatively, you can set roxygen to use markdown formatting by default. If you choose that route, then I think default roxygen styling like \pkg{sf} would have to be changed.

Sorry, I neglected to include the essential Roxygen: list(markdown = TRUE) in the DESCRIPTION. Now fixed in this commit. (And note that \pkg is still necessary to auto-insert the appropriate link to docs of nominated package.)

The note referenced in "see Note" in the ms_rectify_maps() documentation should be under Details. Right now it's included under the Value header.

...

"Automatically save" should be "saved".

Both of those fixed in this commit - thanks for noting!

I was initially confused where the map was stored until I read the second line of the mapname parameter documentation. It would be helpful to add to the function description that it saves the map files in the working directory or optionally in the full path name.

Done in this commit, so that the initial description of ms_generate_maps() now states that a map is, "automatically saved in both .pdf and .png formats, by default in current working directory, or alternative location when mapname includes the full path."

Code Style

The use of a space after the function name, e.g. library (mapscanner) doesn't conform to the rOpensci code style, which is also Hadley's style guide.

Yeah, but it also does not not conform. Like i always say, "white space let's code breathe". I also explicitly mention my white space preferences in CONTRIBUTING.md.

Moreover, in some places there is a space and in others there isn't.

I think I have found and rectified all previous non-spaced instances.

Similarly, if() functions should include {} if they extend to the next line

That's just another personal preference of mine to exclude {} for single-line clauses, but I do ensure that these have free lines above and below to ensure ease of reading. This reduces total code length by 1 line for each of my single-line if clauses.

This is more of a matter of preference, but for functions from packages that are used repeatedly throughout a function, I prefer @importFrom rather than ::. The code is cleaner and easier to read. I would do this with sf and dplyr and possibly purrr.

Sorry @potterzot, but I have to disagree on that one, for reasons that have been very well established in C++. In short, using @importFrom generates potential namespace confusion, while :: remains the only way in R to definitively avoid that.

Code

this "tell Mike" statement should be expanded into an explicit message or removed entirely.

Done, thanks.

Reflecting this comment, handle setting of API key in an informative way.

Done as described at length above.

Line extraction

Quotes from @potterzot:

While the package is explicitly aieed at researchers, another use case would be hikers and runners who want to plan a route but don't want to click on an online map. This would let you draw the route, scan it in to the map, and have it uploaded as a shape. If it could be then converted to a GPX file that would be really neat.

followed by comment:

Adding line extraction would be really neat, but I also don't think it needs to be done as part of the review process. Maybe a later feature down the road.

I will almost certainly defer this until the end of the review process, as extracting continuous lines from raster data is, as I indicated in my previous comment, exceedingly difficult.

@khondula
Copy link

khondula commented Sep 6, 2019

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 10

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

General

This is a great package with many potential uses and I'm excited to see it being developed! Unfortunately I was not yet able to get the main functionality working using a scanned in document, as every attempt seems to stall out at the "converting to spatial format" step (details below). I look forward to guidance from the authors to help diagnose my issues and get it working. Assuming these can be resolved, my only other concerns are some areas where the documentation could be improved, and a question about whether there is a need for a mapbox account.

Specific comments

  • Convert to spatial format fails: Whereas ms_rectify_maps worked using either the included example, or pdfs annotated using the "marker" in a pdf editor, I was unsuccessful in getting it to work with either a scanned in version of a printed out document, or a photo of the same document. Each attempt seems to complete the rectifying and extracting steps within seconds, and then stalls out during the "converting to spatial format" step. I attempted ~6 versions of annotated maps with various levels of complexity and scanning options (Xerox WorkCentre 4260, cell phone picture). My session info is copied below; please advise on what other information would be useful for troubleshooting - sorry if I am missing something obvious!
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
  • Would it be possible to return a message if 'set mapbox token' is completed successfully?
  • I also prefer not having a space between function names and opening parentheses, but defer to author preference.
  • It was unclear to me what the MapBox token is used for. If it is only necessary for acquiring tiles as a rasterbrick, would it be possible to use the rosm package as below, which I don't believe requires signing up for any external account?
library(rosm)
library(sp)
library(sf)
library(dplyr)

northcaro = system.file("shape/nc.shp", package="sf")
nc_sf <- st_read(northcaro)
myextent <- sp::bbox(as(nc_sf, "Spatial"))
osm1 <- osm.raster(myextent)
raster::plotRGB(osm1)
  • I had trouble set the mapbox key as an environment variable in my ~/.Renviron file, error copied below. I did not have any problems using the set_mapbox_token method.
Error in get_mapbox_token() : 
  Map generation requires a mapbox API key to be set with Sys.setenv with a name that includes either the strings 'mapbox' or 'mapscanner'
In addition: Warning message:
In if (length(tok) != 1 | tok == "") stop("Map generation requires a mapbox API key to be set with ",  :
  the condition has length > 1 and only the first element will be used
  • When the modified/annotated map includes polygons but the type argument is "points", would it be possible to include more helpful error message, such as a suggestion to "try type = hull"?
> res <- ms_rectify_maps(f_orig, f_mod, type = "points")
══ mapscanner ══════════════════════════════════════════════════════════════════════════
✔ rectifying the two maps 
✔ extracting drawn objects 
❯ converting to spatial format Error in CPL_geos_op("centroid", x, numeric(0), integer(0), numeric(0),  : 
  Evaluation error: IllegalArgumentException: point array must contain 0 or >1 elements.

Documentation

From readme:

  • "generally be the most convenient for printing, while the rectification itself requires .png-format images" - My understanding is that the conversion to png happens anyway if needed, so this could be taken out of the general intro and only left in the more detailed info.
  • For the initial example, is it potentially confusing to have the main function call wrapped in system.time, since that is not necessary for running the function?
  • "Finally, we can plot the result as an interactive map using packages like mapdeck, or mapview:" Would it be possible to show an example of this code for users unfamiliar with those packages?
  • I agree with the statements about the use of paper maps, however wouldn't there also be use cases where digital annotations (e.g. with a stylus, finger, pdf editor) on a phone/tablet/other device could be saved and used with this package? If so, perhaps some language could be included to clarify that is possible as well.
  • It is unclear what is meant by a "practical workflow". The current language under 'practical tips' seems to imply that other software packages are meant to not be practical. Would it be possible to either clarify, or leave out the phrase about 'unlike most software packages...'? I appreciate that there is a specific workflow/use case in mind, but it is also clear that there are many more potential applications!

From vignette:

  • "enables maps to be printed out" - The wording could be more precise here. I believe it would it be more accurate to say something along the lines of "enables printed out maps to be drawn on and..." since the functions aren't actually doing the printing.
  • Is it possible to add more details to clarify the distinction between type = hulls vs type = polygons?

Other notes

  • One warning during local installation:
In file included from concaveman.cpp:1:
./concaveman.h:587:38: warning: unused typedef 'circ_elem_type' [-Wunused-local-typedef]
    typedef CircularElement<Node<T>> circ_elem_type;
                                     ^
1 warning generated.
  • Warning from github install:
Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'
In file included from concaveman.cpp:1:
./concaveman.h:587:38: warning: unused typedef 'circ_elem_type' [-Wunused-local-typedef]
    typedef CircularElement<Node<T>> circ_elem_type;
                                     ^
1 warning generated.
  • Error in devtools::check()
Loading mapscanner
Error in getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE) : 
  must specify DLL via a “DLLInfo” object. See getLoadedDLLs()
  • results from good practice check have already been addressed in review.

@sckott
Copy link
Contributor

sckott commented Sep 6, 2019

thanks so much @khondula for your review! 🙏

@mpadge all reviews are in now ...

@mpadge
Copy link
Member Author

mpadge commented Sep 12, 2019

@khondula Before I provide a full response, I wonder if you would be so kind as to send me copies of your failing maps? I am quite concerned that both reviewers have encountered fundamental failures in package functionality, and would myself be loathe to accept a package regardless of all other positive sentiments if I can't get the basic functionality actually working! 😏 If they are not too big, you could maybe just drag them into this issue, and they could be deleted again once I've got them? Or maybe email would be easier - check my gh profile for details. Thanks!

@mpadge
Copy link
Member Author

mpadge commented Sep 25, 2019

So @khondula very kindly sent me all of the maps mentioned above. All of them were complete breaking cases, and addressing them has helped enormously to make the package much more robust. Lots of the internal mechanisms have been completely reconceived. The package now makes an initial guess at number of items to be extracted, then estimates a signal-to-noise threshold to optimally identify those features. There is also an additional optional parameter nitems which, if specified, overrides the automatic estimation, and can improve results for difficult cases. The following table summarises what now happens for each of the previously failing files:

file result?
choptank-annotated.pdf ok
choptank-annotated.png ok
choptank-annotate2.png ok
choptank-annotate3.pdf ok
choptank-annotate4.pdf ok
choptank-annotate-5.pdf auto-detects only 5 items and kinda fails on one, but works perfectly with nitems = 6
choptank-annotate-line.png works perfectly as polygon (but actual line-detection not yet implemented; see #5)
choptank-annotate-points.png works perfectly as points, hulls, and polygons
omaha-pink2.pdf auto-detects only 1 item but then errors appropriately; specifying nitems = 2 works
omaha-pink.pdf works, but inaccurately because fails to trim border

@khondula could you please confirm that all is okay? And if so, it'd be nice if you could close the corresponding issue on the package repo itself. Thank you so much for helping to greatly improve the package already!


Edit: Once you've confirmed that you are now able to get the package working, I'll respond to the rest of your review straight away 😄

@potterzot
Copy link

I spent some time debugging. The tests in test-rectify.R fail because they lack a "nitems = 2" parameter call and the internal code finds the number of components to be zero. I think that this is due to lines 284-287 of R/rectify-maps.R not allowing for the case in which all values of dn are less than or equal to 0, and as a result thr0 ends up as NA if dn has some 0 values and some negative values. Anyway, if I change line 284 to be <=, the tests pass. I've submitted a PR.

As to why this works fine on travis but was failing on my laptop I can't say. The only thing I could think of was perhaps cpp compilers are different but that seems like an obscure error. My Makevars is completely empty for building this, so I shouldn't have any weird tweaks.

A second issue

In the process I realized the vignette contains a lot of hardcoded outputs. I would suggest making these outputs the actual output from the code block. This bit me because when trying to build the vignette I thought it was reporting the actual output of the code block but it wasn't.

@mpadge
Copy link
Member Author

mpadge commented Oct 26, 2019

Thanks @potterzot for the PR fix. That's a great catch that took me a while to understand, but yeah, you're absolutely right that the code without that would have errored whenever all(dn == 0), and that's exactly the fix required. Nice.

As for the second issue, most of the vignette runs except for the bits relating to maps of Chennai. These unfortunately have to have fake output because it just takes too long to compile otherwise, and would almost certainly take too long for CRAN checks. The only other bits that use echo=FALSE are to generate the mapshot images of mapview maps, because these generate local image files on the first run, and then just use these from that point on, to avoid re-generating each time. I think that's okay, but please feel free to suggest any specific alternatives.

@potterzot
Copy link

potterzot commented Oct 26, 2019 via email

@sckott
Copy link
Contributor

sckott commented Oct 28, 2019

Approved! Thanks @mpadge for submitting and @khondula and @potterzot for your reviews!

A few comments:

  • link to the COC from the README
  • on check, there's a package size warning on my macos that it's 5.2 mb, maybe compressing some images may bring that below 5?
  • i couldn't try the mapdeck eg because couldn't get the dev versions of it's dependencies to install from source to get required versions - no worries, let's move on

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. you should be able to transfer, let me know if it doesn't work.
  • Add the rOpenSci footer to the bottom of your README
    " [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)"
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • If you already have a pkgdown website, fix its URL to point to https://docs.ropensci.org/package_name and deactivate the automatic deployment you might have set up, since it will not be built centrally like for all rOpenSci packages, see http://devdevguide.netlify.com/#docsropensci. In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
  • Add a mention of the review in DESCRIPTION via rodev::add_ro_desc().
  • Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).

For JOSS:

  • Activate Zenodo watching the repo
  • Tag and create a release so as to create a Zenodo version and DOI
  • Submit to JOSS at https://joss.theoj.org/papers/new, using the rOpenSci GitHub repo URL. When a JOSS "PRE REVIEW" issue is generated for your paper, add the comment: This package has been reviewed by rOpenSci: https://LINK.TO/THE/REVIEW/ISSUE, @ropensci/editors

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

@mpadge
Copy link
Member Author

mpadge commented Oct 28, 2019

Thanks @sckott, that is indeed fabulous news. Before proceeding any further, however, I've opened an issue in the repo on something that's been bugging me, and which i want to solve before finishing up here - and which you yourself raised:

on check, there's a package size warning on my macos that it's 5.2 mb, maybe compressing some images may bring that below 5?

It's not the images, it's ggplot2. In particular, the three figures in the vignette which illustrate the peripheral yet (in our opinions) rather cool feature of polygon aggregation. ggplot2 simply results in huge figure sizes here, and it seems nothing can be done about that, other than to maybe switch to base graphics. But a solution is needed, because there is already another vignette demonstrating the different mapstyles available, and the most important one in the pipeline to arise from a large sociological study currently being conducted with the package ... but neither of those vignettes can actually be included with the package until we can reduce the package size, which it seems can't be done with ggplot2 being used. Any and all assistance greatly appreciated, even if only advice to ditch gg and revert to good old base.

@potterzot
Copy link

@mpadge as far as the ggplot size issue goes, one option may be to use the dpi, height, and width parameters of ggsave() to save the images and then read them in, rather than displaying them directly. I think this would give you control over the file size, but still allow dynamic generation of the figures.

@sckott
Copy link
Contributor

sckott commented Oct 28, 2019

sounds good, thanks @mpadge

@sckott sckott closed this as completed Nov 26, 2019
@sckott sckott reopened this Nov 27, 2019
@mpadge
Copy link
Member Author

mpadge commented Dec 1, 2019

@potterzot @khondula @sckott Finally back on to this 🚀 I've done my best to reduce the figure sizes, tried every and all permutation possible, and have now got it down to what I consider a sufficiently small size (always < 3.5MB installed size on all machines I've checked on). Happy to ... finally and hopefully ... progress to final stages now. Thanks for bearing with me there 😄

@sckott
Copy link
Contributor

sckott commented Dec 2, 2019

thanks @mpadge

did another check, all looks good to me.

any other comments @khondula or @potterzot ?

@potterzot
Copy link

All good for me, thanks @mpadge for your work, and congrats!

@sckott
Copy link
Contributor

sckott commented Dec 2, 2019

@mpadge i think we're all good, i don't think we'll hear from kelly

@khondula
Copy link

khondula commented Dec 2, 2019

all good from my end!

@sckott
Copy link
Contributor

sckott commented Dec 2, 2019

ugh, 😢 - foot in mouth - thanks @khondula !

@sckott
Copy link
Contributor

sckott commented Jan 24, 2020

@mpadge are we all set? anything left to do?

@mpadge
Copy link
Member Author

mpadge commented Jan 24, 2020

Yeah, I still have to do the paper.md for JOSS. I've been waiting for some empirical results from a currently-running study, but that's not forthcoming, so I'll write an alternative one instead. I'll let you know as soon as that's ready, and then I think we'll be done. Hopefully in just a few days time.

@mpadge
Copy link
Member Author

mpadge commented Feb 6, 2020

Okay @sckott, thanks for your patience, and now all done I think, including just updating the last check item in the initial list to enable DOI. Good to go?

@sckott
Copy link
Contributor

sckott commented Feb 6, 2020

yes, good to go!

@mpadge
Copy link
Member Author

mpadge commented Mar 19, 2020

@khondula @potterzot Thank you both so much for your very constructive input into the package, which has finally been officially onboarded and transferred to rOpenSci. I'd appreciate it if you'd both be so kind as to consent to being added as reviewers in the package DESCRIPTION file - just a 👍 will do the job, or just let me know if you'd rather not. @sckott Other than that, most of the above item attended to. I'll paste you full list of item and detail everything as soon as that last step is done. Thanks all!

@mpadge
Copy link
Member Author

mpadge commented Mar 20, 2020

@sckott Thanks for all of the much appreciated editorial assistance with this package. I'm very excited that it's finally crossed the threshold over to rOpenSci. Here's you checklist of items again to explicitly confirm that I've attended to all points:

  • link to the COC from the README
  • on check, there's a package size warning on my macos that it's 5.2 mb, maybe compressing some images may bring that below 5? - Rectified as described in issue #27
  • i couldn't try the mapdeck eg because couldn't get the dev versions of it's dependencies to install from source to get required versions - no worries, let's move on - buzz
  • Transfer the repo to rOpenSci'
  • Add the rOpenSci footer to the bottom of your README
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • If you already have a pkgdown website, fix its URL to point to https://docs.ropensci.org
  • in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository
  • Add a mention of the review in DESCRIPTION via rodev::add_ro_desc().

That function no longer works, but I gave due credit to both reviewers in the DESCRIPTION.

  • Fix any links in badges for CI and coverage to point to the ropensci URL.

For JOSS:

  • Activate Zenodo watching the repo
  • Tag and create a release so as to create a Zenodo version and DOI
  • Submit to JOSS at https://joss.theoj.org/papers/new, using the rOpenSci GitHub repo URL.

Last step not currently possible "Due to Walking Dead World's End" (as sign on a bar in my town helpfully declares).


Other than that, it seems to me that that should be sufficient to consider it successfully onboarded, right? 🎉 🕺 Dunno what the status of blog entries and suchlike is in the midst of global restructuring, but whatever can be worked out whenever it might be able to be, i'll be part of it. Thanks!

@potterzot
Copy link

potterzot commented Mar 20, 2020 via email

@mpadge
Copy link
Member Author

mpadge commented Mar 20, 2020

Yeah, I slightly jumped the gun there and added your name to the DESC before you'd consented - thanks!!

@sckott
Copy link
Contributor

sckott commented Mar 20, 2020

it seems to me that that should be sufficient to consider it successfully onboarded, right?

Yes!

Get the JOSS submission done when you can (after they open back up)

@stefaniebutland i think this package would make a nice blog post. how should mark proceed?

@stefaniebutland
Copy link
Member

Sounds good. @mpadge please see our new rOpenSci Blog Guidelines for Authors and Editors for updated instructions for a blog post or tech note.

@annakrystalli
Copy link
Contributor

⚠️⚠️⚠️⚠️⚠️
In the interest of reducing load on reviewers and editors as we manage the
COVID-19 crisis, rOpenSci new submissions for software peer review are paused.

In this period new submissions will not be handled, nor new reviewers assigned.
Reviews and responses to reviews will be handled on a 'best effort' basis, but
no follow-up reminders will be sent.
Other rOpenSci community activities continue.

Please check back here again after 25 May when we will be announcing plans to slowly start back up.

We express our continued great
appreciation for the work of our authors and reviewers. Stay healthy and take
care of one other.

The rOpenSci Editorial Board
⚠️⚠️⚠️⚠️⚠️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants