Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opencage -- a R package for the opencage API (forward and reverse geocoding) #36

Closed
3 tasks
maelle opened this issue Apr 12, 2016 · 41 comments
Closed
3 tasks

Comments

@maelle
Copy link
Member

maelle commented Apr 12, 2016

    1. What does this package do? (explain in 50 words or less)
      This package is an interface to the opencage API (free account: 2500 calls/day). It allows forward geocoding (from placename to lat/lon) and reverse geocoding (vice versa). The API has many data sources.
    1. Paste the full DESCRIPTION file inside a code block below.
Package: opencage
Type: Package
Title: Interface To The OpenCage API
Version: 0.1.0
Authors@R: person("Maëlle", "Salmon", email = "maelle.salmon@yahoo.se", role = c("aut", "cre"))
Description: This packages accesses the OpenCage API which provides Forward
    geocoding (from placename to longitude and latitude) and Reverse geocoding (from
    longitude and latitude to placename)
License: GPL (>= 2)
LazyData: TRUE
URL: http://github.com/masalmon/opencage
BugReports: http://github.com/masalmon/opencage/issues
Imports:
    httr,
    jsonlite,
    dplyr,
    lubridate,
    memoise
RoxygenNote: 5.0.1
Suggests: testthat, lintr
    1. URL for the package (the development repository, not a stylized html page)
      https://github.com/masalmon/opencage
    1. What data source(s) does it work with (if applicable)?
      opencage website states "open geospatial data including OpenStreetMap, Yahoo! GeoPlanet, Natural Earth Data, Thematic Mapping, Ordnance Survey OpenSpace, Statistics New Zealand, Zillow, MaxMind, GeoNames, the US Census Bureau and Flickr's shapefiles plus a whole lot more besides."
    1. Who is the target audience?
      People that need to geocode.
    1. Are there other R packages that accomplish the same thing? If so, what is different about yours?
      I know no function for reverse geocoding but for forward geocoding there's the geocode function in ggmap. "Geocodes a location (find latitude and longitude) using either (1) the Data Science Toolkit (http://www.datasciencetoolkit.org/about) or (2) Google Maps." However it has either Google Maps or Data Science Toolkit. opencage actually builds on other geocoders including Data Science Toolkit ("There's Nominatim, Data Science Toolkit and the Two Fishes geocoders." )
    1. Check the box next to each policy below, confirming that you agree. These are mandatory.
    • [ x] This package does not violate the Terms of Service of any service it interacts with.
    • [ x] The repository has continuous integration with Travis CI and/or another service
    • The package contains a vignette
    • [ x] The package contains a reasonably complete README with devtools install instructions
    • [ x] The package contains unit tests
    • [ x] The package only exports functions to the NAMESPACE that are intended for end users
    1. Do you agree to follow the rOpenSci packaging guidelines? These aren't mandatory, but we strongly suggest you follow them. If you disagree with anything, please explain.
    • Are there any package dependencies not on CRAN?
    • [ x] Do you intend for this package to go on CRAN?
    • [ x] Does the package have a CRAN accepted license?
    • Did devtools::check() produce any errors or warnings? If so paste them below.
    1. Please add explanations below for any exceptions to the above:

The package does not have a vignette yet but it only has two functions with nearly the same arguments. However I agree that it might be good to show use cases, but since I don't geocode a lot in R, I need a bit of feedback about what applications are for users that really geocode.

    1. If this is a resubmission following rejection, please explain the change in circumstances.
@sckott
Copy link
Contributor

sckott commented Apr 12, 2016

Thanks for your submission! Seeking reviewers now

@sckott
Copy link
Contributor

sckott commented Apr 15, 2016

Reviewers: @juliasilge
Due date: 2016-05-06

@maelle
Copy link
Member Author

maelle commented Apr 15, 2016

@juliasilge I'll add info about remaining calls/rate limit in the output this week-end, it's missing right now.

@juliasilge
Copy link

juliasilge commented Apr 27, 2016

General Comments

  • 🙌🌐🙌🌐 This is a well-executed package to call the OpenCage API for forward and reverse geocoding and, in the end, get the results in a list with a nice, convenient data frame. The functions and tests have been crafted carefully so the user interacts with the API well.
  • This package has a clear, sort of minimalist scope; it has just two well-designed functions for the user to interact with and that is great.
  • I have never used memoise before and it is spiffy! This is a great application of it.
  • How to get an API key is made super clear, and the explanations of how to store and use it are also explicitly included, which is awesome. 👍
  • I am confused about how many results I will get when I query, especially for opencage_forward(), and what that means. When I query for Salt Lake City, I get 8 results; when I query for Fort Worth, I get 6 results. I see that the default to return is 10, but what do these multiple results mean? Are they boundary points? Are they different responses from the different sources behind OpenCage like Zillow/Yahoo/US Census/etc? (For both of those U.S. city examples, they seemed to be all from OpenStreetMap, FWIW.) I think the answer to this needs to be clear in the README, the .Rd for forward geocoding at least, and the vignette -- what do the multiple results mean? For some reason the multiple results for reverse geocoding don't seem as puzzling to me, but it still is probably good to clarify this a bit.
  • The tests are thorough, clear, and well written. Almost all of the tests call the API and thus will be skipped on CRAN, but that's just the nature of the beast for this type of package, I think.

Tests

  • All tests run successfully on OSX El Capitan, R 3.2.5, after I stored an OpenCage API key as "OPENCAGE_KEY" in my .Renviron
  • All code in README, vignettes, and examples knit/ran as expected (after I stored an API key)

DESCRIPTION file

  • The title should be title case, which means "to" and "the" should be lower case.
  • The description should be a complete sentence, which only means it needs a period. Also, I believe forward and reverse should be lower case; they aren't proper nouns or anything.
  • Can you make your lists for Imports and Suggests formatted the same? One package on the same line with the "Imports:|Suggests:" then one package on each following line
  • This one is sooooo picky, but before using "which" like you do in the description you need a comma.

Documentation

  • The last example for opencage_forward is too long and is going to get truncated in the PDF manual on CRAN; you can just split that and put it on two lines like you would in an R script.
  • When you are explaining how to type code in documentation (.Rd or vignette) make that more explicit. For example, for how to use bounds in the function documentation, say something like "of a bounding box with \code{bounds =} min long, min lat, max long, max lat. For example, \code{bounds = -0.563160, 51.280430, 0.278970, 51.683979}"
  • Speaking of bounds, isn't this missing a c() or something? I couldn't actually get a bounding box to work. I think we should see a working example with a bounding box in the vignette.
  • Use the same kind of quotes for country code and language code in your documentation, i.e. 'GB' and 'pt-BR', because these are both treated as strings. Actually, why don't you make either one or two examples in the vignette that show you using all of the parameters? It could be one API call with, like, ALL of them, just to show how they work in action.
  • Either capitalize or don't capitalize all the beginning of your explanation of parameters. I actually don't know if one is preferred. May capitalizing? i.e., "Your OpenCage key", "Provides the geocoder...", "Restricts the results...", etc.
  • The paragraph you are using to explain how to get an API key is super important and I am really glad you are so explicit, but it is slightly awkward with passive voice, etc. Since this is used several times, absolutely as it should be, (README, vignette, .Rds), how about some clean-up for clarity and smooth reading? Something along the lines of: "To get an API key to access OpenCage geocoding, register at https://geocoder.opencagedata.com/pricing. The free API key provides up to 2,500 calls a day. For ease of use, save your API key as an environment variable as described at https://stat545-ubc.github.io/bit003_api-key-env-var.html" Don't forget a period after this in the .Rds, but put the bit about memoise AFTER the bit about when OpenCage is updated, I think.
A couple of picky grammar/capitalization/punctuation things...
  • In the function documentation, geocoding should not be capitalized as a proper noun, i.e., "Forward geocoding, from placename to latitude and..." I think I would remove "that is".
  • In the vignette, under "Output" use a semicolon here: "details here and here; for pure conversion"
  • In the vignette, at the end of "Output" either make it two sentences or use a semicolon: "one gets different columns; there can be a lot to explore!"

Functionality

  • I've been thinking about how to structure what you get from the API call, and I think that what you've done here, a list that includes a data frame, is the right choice. This makes good sense.
  • Right now you have calls to data() in utils.R but a) that's not necessary because the package knows about the data files when it is loaded (I think this is how it works?!?!) and b) it's not considered good practice to do that anyway. I tried the package out after removing all of those lines and it works fine.
  • You need to add the names of your data objects (code_message, countrycodes, languagecodes) to a call to globalVariables because R CMD check thinks they have "no visible bindings". I will be honest that I have no idea if this actually does anything of substance beyond quieting a NOTE, but if you are planning to submit to CRAN, then just go ahead. 😒 You can see what it needs to look like here; you should probably put it in a separate file like that.
  • The tests in test-opencage_query_check.R do not skip_on_cran() but they call Sys.getenv("OPENCAGE_KEY"). How is that going to work? That needs to be changed, right?
  • You need to have #' @import memoise in the .Rds for both main functions so that Roxygen knows that you need to import memoise.

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

Thank you so much, @juliasilge! I will do my homework this week-end. 😊

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

Thanks again for this very precise review. It was really awesome to get your feedback on this package, @juliasilge! I like memoise very much since Bob (Rudis) told me about it and this package was a great opportunity for me to test it.

I actually had time to make the necessary changes while waiting for other codes to run.

Most changes were straightforward to make because they were well explained + I won't discuss English language issues and I really appreciate your being picky since I'm not a native speaker!

Regarding multiple results, I added a paragraph in the README, vignette and Rd "Regarding multiple results, the API doc states that "In cases where the geocoder is able to find multiple matches, the geocoder will return multiple results. The confidence or coordinates for each result should be examined to determine whether each result from an ambiguous query is sufficiently high to warrant using a result or not. A good strategy to reduce ambiguity is to use the optional bounds parameter described below to limit the area searched." Multiple results might mean you get a result for the airport and a road when querying a city name, or results for cities with the same name in different countries." Is it sufficient?

I added three examples to the README and vignette with bounds, country and language arguments and I corrected the documentation the bounds argument.

I changed the paragrapher about the API key, many thanks for re-phrasing it!

The tests in test-opencage_query_check.R do not call the API but I realized they would mean trying to get an environment variable so I now skip them, thanks for spotting this issue!

I now get only one NOTE, "Authors@R field gives persons with no valid roles: Julia Silge [rev]" -> @sckott I will have to explain this when submitting to CRAN, won't I? Thanks a lot @juliasilge regarding the global bindings. I had no idea what to do with them.

I hope I haven't missed anything. Many thanks for helping me improve the package!

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

I have a question regarding CRAN submission: the vignette will not build on CRAN, how should I make this clear?

@sckott
Copy link
Contributor

sckott commented Apr 28, 2016

no idea about rev - doesn't seem like a valid role, but i guess others use it?

For the vignette, don't know how to have it not build on CRAN - i cheat by building it to .md, then changing name to .Rmd - so when building, the code blocks are already rendered - i imagine some may disagree with me though on this

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

@sckott in which repo could I see an example of your vignette strategy/cheating? & do you comment on this in the comments when submitting?

@sckott
Copy link
Contributor

sckott commented Apr 28, 2016

e.g,. https://github.com/ropensci/geojsonio

no - they don't check this - it's only cheating b/c ideally the vignette will build on check/etc. so you know it works, but if you can't expect it to work in all scenarios (e.g., web APIs, env keys needed, etc.) then can build beforehand

the Makefile helps do the build then file rename

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

Ok, thanks a lot. I hope I'll deal with this well. I'll submit to CRAN once I hear Julia's opinion about the changes.

@richfitz
Copy link
Member

richfitz commented Apr 28, 2016

Despite earlier discussions about rev (cc: @noamross), the full MARC list does not appear available to us: https://github.com/wch/r-source/blob/e5b21d0397c607883ff25cca379687b86933d730/src/library/utils/R/sysdata.R#L28-L37

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

@richfitz but I guess commenting on rev should be enough? I should try to find another pkg where there is a rev.

@richfitz
Copy link
Member

richfitz commented Apr 28, 2016

I think it's a reasonable field to have, and would be worth seeing if CRAN are willing to accept it. They may update the fields or suggest some other way of handling this that they will tolerate. Add it to your comments and let us know how it goes down

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

@richfitz ok! This will be my first CRAN submission so I have to brace myself for rejection anyway.

@noamross
Copy link
Contributor

noamross commented Apr 28, 2016

The function used by R CMD check appears to use the whole MARC database:
https://github.com/wch/r-source/blob/e5b21d0397c607883ff25cca379687b86933d730/src/library/utils/R/citation.R#L150-L171

On Thu, Apr 28, 2016 at 9:58 AM Maëlle Salmon notifications@github.com
wrote:

@richfitz https://github.com/richfitz ok! This will be my first CRAN
submission so I have to brace myself for rejection anyway.


You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub
#36 (comment)

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

@noamross thanks but then why do I get the NOTE on Travis and Appveyor?

Authors@R: c(person("Maëlle", "Salmon", email = "maelle.salmon@yahoo.se", role = c("aut", "cre")), person("Noam", "Ross", role = "ctb"), person("Julia", "Silge", role = "rev"))

@richfitz
Copy link
Member

richfitz commented Apr 28, 2016

probably the --as-cran or other setting that make various bits come on and off on different platforms.

@richfitz
Copy link
Member

richfitz commented Apr 28, 2016

aha, here it is: https://github.com/wch/r-source/blob/0ae4ee0180c641a6bd2e7a8b1d2ea9f21b5f36b7/src/library/tools/R/check.R#L683-L690

(there's a certain irony that the QC system is so hard to navigate through)

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

@richfitz well it's not hard to navigate through when someone like you gives one all useful links, many thanks! So on CRAN strict should be FALSE? That's nice.

@richfitz
Copy link
Member

richfitz commented Apr 28, 2016

I have no idea - all I know is there are two calls to that function, one with strict=TRUE and one with strict=FALSE. The strict one is probably triggered by running R CMD check with --as-cran, but possibly also involving some of the weird environment variables like _R_CHECK_CRAN_INCOMING_

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

I have just looked at the help of "? person" as suggested in https://cran.r-project.org/doc/manuals/r-release/R-exts.html and it only shows the "strict" roles and before that it's written that "The new scheme also adds the possibility of specifying roles based on a subset of the MARC Code List for Relators ". So I'm not sure I'll be able to have a 'rev' person, but it's worth trying.

@maelle
Copy link
Member Author

maelle commented Apr 28, 2016

But in https://journal.r-project.org/archive/2012-1/RJournal_2012-1_Hornik~et~al.pdf "**In general, while all MARC relator codes are supported,
the following usage is suggested when giving
the roles of persons in the context of authoring R
packages" -> makes me more optimistic.

@juliasilge
Copy link

juliasilge commented Apr 29, 2016

This is looking SO great! The new examples in the README/vignette are really good and clear, and the package is coming together so well. I only have small changes to suggest now for consistency and readability in the documentation. Thank you so much for adding me as a reviewer in the DESCRIPTION; hopefully CRAN isn't annoyed on my/your account.

README

  • The paragraph about getting the API key still reads kind of awkwardly with the passive voice and all that. Would it be OK with you to use similar wording to what you currently have in the .Rds? So that whole paragraph would read like this: "This package is an interface to the OpenCage API that allows forward and reverse geocoding. To use the package, you will need an API key. To get an API key for OpenCage geocoding, register at https://geocoder.opencagedata.com/pricing. The free API key provides up to 2,500 calls a day. For ease of use, save your API key as an environment variable as described at https://stat545-ubc.github.io/bit003_api-key-env-var.html."
  • Make Sys.getenv("OPENCAGE_KEY") code, and maybe add "manually" at the end of that sentence.
  • At the beginning of the paragraph about multiple results, add one more sentence to make it clear what you're talking about: "Both forward and reverse geocoding typically return multiple results." Then you need a comma in the next sentence: "Regarding these multiple results, the API doc states, "In cases..."
  • Put a comma in "For example, bounds"
  • Don't forget the quotes for "en" in the language section.

Details section in .Rds

  • You could use \url{} (like \url{https://geocoder.opencagedata.com/pricing}) and then the URLs become clickable links.
  • Don't forget a period after the sentence with the URLs, and you have two sentences right now about saving the API key and the link to Jenny Bryan's tutorial in the forward geocoding .Rd. Be sure to take out the second one.
  • When you talk about Sys.getenv("OPENCAGE_KEY") make it \code{}.
  • I think the discussion of multiple results is a little long here for the .Rds (but GREAT for the README and vignette). Maybe something in just one sentence like: "This function typically returns multiple results because of placename ambiguity; consider using the bounds parameter to limit the area searched."

Nitpicky punctuation and spelling details for .Rds

  • Periods at the end of all the descriptions of parameters in .Rds
  • Comma in "For example, \code{bounds"
  • Make sure it is "bounding box", not "boundsing box"
  • Don't forget the quotes for "en" in the language section.
  • For Value of what is returned, either put commas at the end of all of them or don't.

Vignette

  • Take the second part of the title after the dashes and make it a subtitle.
  • Do you think it would be OK to use the paragraph without passive voice, etc about getting an API key here as well? Instead of "For this register..." just start in with "To get an API key for OpenCage geocoding..."
  • Make Sys.getenv("OPENCAGE_KEY") code.
  • Make the same changes to the paragraph about multiple results as in the README.
  • Put a comma in "For example, bounds"
  • Don't forget the quotes for "en" in the language section.

@juliasilge
Copy link

juliasilge commented Apr 29, 2016

Oh, and I have not thought or looked at examples at how to have the vignette not build on CRAN at this point.

@jennybc
Copy link
Member

jennybc commented Apr 30, 2016

I describe how I suppress vignette building on CRAN here in a googlesheets vignette.

@maelle
Copy link
Member Author

maelle commented Apr 30, 2016

Thanks @juliasilge & @jennybc! You make improving this pkg such a pleasure! 😊

@maelle
Copy link
Member Author

maelle commented Apr 30, 2016

@juliasilge I've now made all changes! Unless I forgot one comment 😬. The documentation reads much better now, many thanks! You've been so patient about the bad punctuation!

Once you'll be happy with the changes I'll prepare the CRAN submission using @jennybc's and @sckott's suggestions about the vignette. 😸

@juliasilge
Copy link

juliasilge commented May 2, 2016

I have gone through the package one more time and things look great to me. I would probably recommend making the title and subtitle in the vignette title case("R Package for the OpenCage API" and "Forward and Reverse Geocoding") but other than that, I think this is good to go from my perspective. 👍👍

@maelle
Copy link
Member Author

maelle commented May 3, 2016

Thank you so much, @juliasilge! I will make this correction.

@sckott should my next step be to transfer the repo to ropenscilabs or to submit to CRAN? I won't do it before the week-end anyway I think.

An issue I had was my first name. I wonder which encoding is better for the description file to still be "Maëlle" so I want to check various solutions. When I tried a few days ago to build source and then look at the DESCRIPTION in the .tar.gz my name was Ma?lle or other variations. That said I'm Maëlle in this package so I'll look at this description file. [I sound so narcissistic now...]

@sckott
Copy link
Contributor

sckott commented May 3, 2016

Having a look real quick...

@sckott
Copy link
Contributor

sckott commented May 3, 2016

An issue I had was my first name.

I think you want Encoding: UTF-8 in your DESCRIPTION file

@maelle
Copy link
Member Author

maelle commented May 3, 2016

oooh many thanks!!

Den tisdag, 3 maj 2016 18:50 skrev Scott Chamberlain <notifications@github.com>:

An issue I had was my first name.
I think you want Encoding: UTF-8 in your DESCRIPTION file—
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub

@sckott
Copy link
Contributor

sckott commented May 3, 2016

did that work ?

@maelle
Copy link
Member Author

maelle commented May 3, 2016

Yes it did! This line + making sure I saved it with UTF-8 encoding in RStudio. Thank you!

@sckott
Copy link
Contributor

sckott commented May 3, 2016

@masalmon

should my next step be to transfer the repo to ropenscilabs or to submit to CRAN?

Transfer to ropenscilabs - then submit to CRAN whenever you're ready - (a pkg can be on CRAN already when submitted here of course)

@maelle
Copy link
Member Author

maelle commented May 3, 2016

Ok, I will transfer soon. I need to figure out why I've just broken my build 😁

@maelle
Copy link
Member Author

maelle commented May 3, 2016

I've now transferred the repo, I just need to change the badges in the readme now (I've changed the links and added the rOpenSci things at the bottom of the readme).

I'd like to let this issue open as long as I haven't got the package on CRAN.

@sckott
Copy link
Contributor

sckott commented May 3, 2016

great, thanks!

I'd like to let this issue open as long as I haven't got the package on CRAN.

Will that be soon? We generally close these issues after approval and repo transfer, but if it's soon we can leave this open until then

@maelle
Copy link
Member Author

maelle commented May 3, 2016

Yes that'll be soon but I've changed my mind, I won't change the habits of this repo and I'll ask questions on Slack if I get issues with the vignette. Thanks again to you all!!

@maelle maelle closed this as completed May 3, 2016
@sckott
Copy link
Contributor

sckott commented May 3, 2016

or can ask in your opencage repo - either way is fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants