FedData package in R #13
Comments
@jooolia assigned |
Review and Comments:
Overall, this looks like a useful package for navigating, downloading, and parsing medium to large sets of data from some US government data collections. All of the functions work; however, I think significant improvements can be made to the examples and the documentation that would make the package easier to use. I have tried to provide general comments with some specific examples in this review.
Installation:
In README.MD the authors indicate that they have installed the package successfully on Mac OS, Windows and Linux systems. It would be useful to have more information on this and/or a link to helpful information. I easily installed the package on my Windows 7 machine, but after many frustrating hours gave up trying to install the package on Linux Fedora 19 (even with GEOS, etc. installed). It is nice that there is code to help install the package for Mac users, but this could be expanded to include other OSs.
Specific comments about the scripts and functions:
Example script in FedData.Rd: I found that line 78 gets truncated when displayed in the help menu. What shows up is "county <- county[!(county$STATE" instead of "county <- county[!(county$STATE %in% c("AK","VI","PR","HI")),]", which makes the examples hard to follow. I had problems with the http://websoilsurvey.sc.egov.usda.gov site being down when I was testing the example. Is it commonly unreliable? It took some hunting for me to determine that the problem was with the website and not with my system. Would it be possible to test the state of the website and then return a message letting the user know that the website is down? That could be more informative than:
GHCN_FUNCTIONS.R
ITRDB_FUNCTIONS.R
NED_FUNCTIONS.R
NHD_FUNCTIONS.R
SSURGO_FUNCTIONS.R
UTILITY_FUNCTIONS.R
rOpenSci criteria: (Based on criteria laid out in the rOpenSci packaging guide)
Package name:
Function naming:
Coding style:
Readme:
Code of conduct:
Documentation:
Examples: It would be nice to have more. Do you use specific libraries for visualization too? I think it would be very helpful to have simple visualization of the data downloaded and extracted (e.g. plot(map), plot(downloadeddata)).
Recommended scaffolding: rOpenSci recommends httr over RCurl, so you could look into that (e.g. line 111 in GHCN_FUNCTIONS.R).
Console messages:
Local Building and Testing of package
I hope these comments are helpful and please let me know if anything is unclear in these comments. |
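The review's suggestion to test whether a data site is reachable and tell the user when it is not could look roughly like the sketch below. It is only an illustration under stated assumptions: `check_site` is a hypothetical helper (not part of FedData), the message wording is invented, and the default check assumes the curl package is installed. The `fetch` argument is injectable so the logic can be tried without network access.

```r
# Hypothetical helper; the name, arguments, and message text are illustrative
# and are not part of FedData.
# `fetch` must return an HTTP status code or throw an error.
check_site <- function(url, timeout_sec = 10,
                       fetch = function(u) {
                         # HEAD-style request: ask for headers only
                         h <- curl::new_handle(nobody = TRUE, timeout = timeout_sec)
                         curl::curl_fetch_memory(u, handle = h)$status_code
                       }) {
  # Treat any connection error as "no response"
  status <- tryCatch(fetch(url), error = function(e) NA_integer_)

  if (is.na(status) || status >= 400) {
    detail <- if (is.na(status)) "no response" else paste("HTTP status", status)
    message("The data server at ", url, " appears to be unavailable (",
            detail, "). Please try again later.")
    return(invisible(FALSE))
  }
  invisible(TRUE)
}
```

A download function could call `check_site()` first and stop early with this message instead of surfacing a low-level connection error.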
Wow, I was totally not expecting such a thorough review. It's incredible, @jooolia! Thank you so much! @sckott I'm at a workshop in Georgia all week, but I'll start adding these as issues tonight/tomorrow, and will work towards addressing them in the evenings and definitely next week. Thanks again! |
@bocinsky any progress on changes? |
@sckott Fair amount of progress on most of the issues drawn from @jooolia's review (see issues, etc.). I ended up submitting a new version (2.0.0) to CRAN that incorporates most updates, plus some spurred by an update to igraph that was causing CRAN check warnings. One thing I haven't gotten to (and something I want to do prior to resubmitting?) is the transition from RCurl to httr. Otherwise, expanding the examples and writing a vignette are still on the docket. |
Nice, great progress. Thanks for the update @bocinsky |
@bocinsky Any progress?
You don't need to resubmit. Just let us know when you've made the RCurl to httr changes, and then we're good to go. Let me know if you need help with that. |
@sckott Still haven't made that transition. The issue is that I use time-stamping to ensure that people have the latest version of downloaded files, and I haven't been able to figure out how to make that work with httr. Any help would be much appreciated. The function I would need to port to httr is the "curl_download" function in the UTILITY_FUNCTIONS.R source file. |
thanks for the update. I'll have a look |
Awesome! Thanks Scott! |
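so maybe the
# Assumes `url`, `destfile`, `temp.file`, `verbose`, and `progress` are already defined.
opts <- list(
  verbose = verbose,
  noprogress = !progress,
  fresh_connect = TRUE,
  ftp_use_epsv = FALSE,
  forbid_reuse = TRUE,
  # Only transfer if the remote file is newer than the local copy
  timecondition = TRUE,
  timevalue = base::file.info(destfile)$mtime)
hand <- curl::new_handle()
curl::handle_setopt(hand, .list = opts)
res <- curl::curl_download(url, destfile = temp.file, handle = hand)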
let me know what you think. |
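For context, the time-stamping idea discussed above could be wrapped up roughly as follows. This is a sketch under stated assumptions, not the package's actual code: both function names are hypothetical, and it assumes that when the time condition is not met (the remote file is not newer) the download leaves an empty temporary file, so the existing local copy should be kept.

```r
# Hypothetical helpers; names are illustrative and not FedData's actual API.

# Copy a freshly downloaded temp file over `destfile` only if it has content.
# Assumption: an unchanged remote file yields an empty download when a time
# condition is set, in which case the existing local copy is kept.
keep_if_downloaded <- function(temp_file, destfile) {
  size <- file.info(temp_file)$size
  if (!is.na(size) && size > 0) {
    file.copy(temp_file, destfile, overwrite = TRUE)
    return(TRUE)
  }
  FALSE
}

# Download `url` to `destfile`, but only replace an existing local copy
# when the server has a newer version (requires the curl package).
download_if_newer <- function(url, destfile) {
  temp_file <- tempfile()
  on.exit(unlink(temp_file), add = TRUE)

  hand <- curl::new_handle()
  if (file.exists(destfile)) {
    # Ask the server to send the file only if modified since the local mtime
    curl::handle_setopt(hand,
      timecondition = TRUE,
      timevalue = as.numeric(file.info(destfile)$mtime))
  }
  curl::curl_download(url, destfile = temp_file, handle = hand)
  keep_if_downloaded(temp_file, destfile)
}
```

Downloading to a temporary file first means a failed or skipped transfer never clobbers the copy the user already has.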
Hey @sckott! Thanks so much for your help with this... I was able to successfully migrate away from RCurl, and have pushed those updates. Updating on CRAN now! I've also closed out several of the issues re. the ROpenSci review. Are we good to go for finalizing the onboarding? |
Great, glad it worked.
Just a few things:
|
@bocinsky any thoughts on my above comments? |
@sckott Sorry for the lack of reply! I definitely want to address all of them (the examples are the most important), but I haven't gotten a chance to sit down and do them. |
thanks, let us know if you need any help |
@bocinsky Any updates on this? Anything we can help with? |
Hey hey. I just pushed a new version to GitHub and CRAN that does three things:
The last one is tricky, because testing the principal functions is slow and data-heavy in this package. When I do, I suppose I'll make it so they aren't tested on CRAN. |
@bocinsky Thanks for the change. Would like to see some tests before we approve this. For tests in your pkg, could you have that principal function just do a subset of whatever it is it does so that it runs faster? You can easily skip on CRAN with the fxn |
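As an illustration of the two suggestions above (test a small subset, and skip on CRAN), here is a minimal sketch assuming the testthat package. `fetch_tiles` is a hypothetical stand-in for a slow, data-heavy function and is not part of FedData; `skip_on_cran()` is one testthat function that skips a test on CRAN's check machines.

```r
library(testthat)

# Hypothetical stand-in for a slow, data-heavy package function.
# A real function would download one file per tile id.
fetch_tiles <- function(tile_ids) {
  vapply(tile_ids, function(id) paste0("tile_", id),
         character(1), USE.NAMES = FALSE)
}

test_that("fetch_tiles works on a small subset", {
  # Skipped on CRAN; runs locally when the NOT_CRAN environment
  # variable is "true" (devtools::test() sets this)
  skip_on_cran()

  # Exercise only two tiles rather than the full set to keep the test fast
  tiles <- fetch_tiles(1:2)
  expect_length(tiles, 2)
  expect_type(tiles, "character")
})
```

Keeping the subset tiny lets the test run on every local check, while CRAN never has to hit the remote servers.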
Thanks so much @sckott. Will work on implementing some tests (mainly having to do with testing whether data URIs are valid) and will be back in touch. |
Hi @sckott I've just pushed a new version that implements testing for all download URLs and some other functions. This has already come in handy, as the USGS changed the path of one of their FTP servers. Checking version 2.1.0 for CRAN submission on win-builder now. |
Great, thanks for the changes. |
@bocinsky So so sorry about the long long time since we revisited this. I think what happened was I applied the
|
@sckott This is great news! As you'll see in the current readme (which I'll update as suggested), I'm currently working on the next major release of FedData, but slowly. Will migrate over to ropensci now, and continue dev on the next version, which will bring integration with I could definitely crank out a tech note on Cheers! |
Great. All sounds good. Let me know if you have trouble transferring. I'll have our community manager Stefanie get in touch with you about the blog post |
Hi @sckott . Got moved over to ropensci, and submitted documentation changes to CRAN as version 2.4.6. The only thing left to do is to change the URL that points to the pkgdown documentation (which is a mask for the gh-pages site at |
Cool. Try again now on the gh-pages thing - made you admin on that repo |
closing this, thanks again for your submission |
Hi @bocinsky. Thank you for offering to contribute a tech note on We post tech notes as soon as they are submitted and lightly reviewed by one of us, usually me or Scott. Practical instructions: https://github.com/ropensci/roweb/blob/master/.github/CONTRIBUTING.md It would be great if you included a thank you to package reviewers with links to their GitHub or Twitter, maybe point readers to issues and what you think is next to improve the package and invite people to open or address an issue etc. suggested tags: R, community, software, review, onboarding, package, package_name, topic labels used in onboarding review. Ping me here with any questions! |
Hi @stefaniebutland. I just submitted a pull request for a FedData tech note and a headshot and description for the community page. Let me know if there is anything else I need to do! |
Allows for automated geospatial querying and downloading of raw data from several federated databases.
https://github.com/bocinsky/FedData
The National Elevation Dataset (USGS), National Hydrography Dataset (USGS), SSURGO soils database (USDA), Global Historical Climatology Network (NOAA), and the International Tree Ring Databank (NOAA).
Researchers, government employees/land-managers, and anyone else interested in accessing these databases.
Two not on CRAN, but available from http://www.omegahat.org/R.
Does devtools::check() produce any errors or warnings? If so paste them below.
I'm a fairly novice programmer, so [as far as I know] I don't use Travis CI or unit tests. I've not yet written a vignette (it's on my to-do list).
I think others have suggested FedData to rOpenSci, but the latest version is far more stable and platform-agnostic.