-
-
Notifications
You must be signed in to change notification settings - Fork 105
stats19 #266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
thanks for your submission @Robinlovelace We are discussing now and will get back to you soon |
Great. It's in active development as you'll see. We have code to make all those millions of crash points sf data frames, we've just not pushed it. @layik has done great work refactoring the download functions so it makes it much easier to access this important dataset. Looking forward to hearing back from you and thanks for the update. |
Heads-up @sckott |
will dig into it today, i'll let you know |
Editor checks:
Editor commentsThanks for your submission @Robinlovelace ! Here's the output from goodpractice. No need to address these now, it's info for reviewers to use to get started. ── GP stats19 ─────────────
It is good practice to
✖ write unit tests for all functions, and all package code in general. 95% of code lines are covered by test cases.
R/dl_stats19.R:82:NA
R/dl_stats19.R:83:NA
R/format.R:55:NA
R/format.R:106:NA
R/format.R:157:NA
... and 12 more lines
✖ add a "URL" field to DESCRIPTION. It helps users find information about your package online. If your package does not have a homepage, add an URL
to GitHub, or the CRAN package package page.
✖ add a "BugReports" field to DESCRIPTION, and point it to a bug tracker. Many online code hosting services provide bug trackers for free,
https://github.com, https://gitlab.com, etc.
✖ use '<-' for assignment instead of '='. '<-' is the standard, and R users and developers are used it and it is easier to read your code for them if
you use '<-'.
R/dl_stats19.R:28:12
R/dl_stats19.R:30:9
R/dl_stats19.R:31:16
R/dl_stats19.R:32:11
R/dl_stats19.R:34:18
... and 204 more lines
✖ avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines
shorter than 80 characters
R/dl_stats19.R:28:1
R/dl_stats19.R:104:1
R/dl_stats19.R:105:1
R/format.R:42:1
R/format.R:98:1
... and 14 more lines
✖ avoid sapply(), it is not type safe. It might return a vector, or a list, depending on the input data. Consider using vapply() instead.
R/format.R:257:20
✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and 1:NCOL(...) expressions. They are error prone and result 1:0 if the expression on
the right hand side is zero. Use seq_len() or seq_along() instead.
R/format.R:247:11
✖ fix this R CMD check NOTE: Namespaces in Imports field not imported from: ‘dplyr’ ‘foreign’ ‘lubridate’ All declared Imports should be used. Seeking reviewers now 🕐 Reviewers:
|
Thanks @sckott. I'd seen goodpractice but never used it - that is very useful. After a few commits stats19 is improved. Latest results:
|
Reviewers assigned:
|
ProlegomenaThanks to rOpenSci for the invitation to review, and special thanks to Robin, Malcolm, Layik, and Mark for this important package. I'm already a huge fan. Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
For packages co-submitting to JOSS
The package contains a
Functionality
Final approval (post-review)
Estimated hours spent reviewing: 3.5 Review Comments
|
thanks very much for your review @daranzolin |
Yeah, thanks @daranzolin for an impressively prompt review. I'll leave the response proper to @layik and @Robinlovelace, but will confess guilty here to having commented out the interactivity in the |
Many thanks for the quick yet detailed review @daranzolin. I'm working on it this week and will get back to you on each of those constructive comments asap. More soon... |
Great review @daranzolin. Appreciate it. Thanks @Robinlovelace and @mpadge. Apologies I just missed this thread for some reason. |
Heads-up all (especially @daranzolin, @mpadge and @layik - the originator of interactive downloads in stats19), I've re-added interactivity, this time using Please check out the video to see how it works here: And if you're interested in the code used to do this, see here: https://github.com/ITSLeeds/stats19/pull/36 Feedback on behaviour or code welcome before I merge this. Many thanks @layik for implementing this 1st time around. |
Just going through the review @daranzolin, RE vignette, I think what you needed to do was devtools::install(build_vignettes = TRUE) then you would be able to load the vignette. |
Heads-up we've addressed most of the issues raised by @daranzolin I think. Many thanks for a great set of comments. We're documenting them here https://github.com/ITSLeeds/stats19/issues/49 and plan to put the review responses here https://github.com/ITSLeeds/stats19/blob/master/responses1.Rmd Not quite finished but plan to properly respond in a comment here by Saturday this week - sorry it's taken a while. |
Hi @daranzolin, as promised by @Robinlovelace find our responses and thank you very much for taking time to do the review and please feel free to let us know if you have any further feedback. ===== Responses to review 1 of stats19Thanks for the review. We've had a chance, after making some changes and fixes to the package, to take-in and act on each of the comments. The code-base has evolved substantially since the review, but the fundamental design of the package, with its 3 stage API mirroring workflows that happened before the package was developed, remains unchanged. That is:
dl_stats19(year = 2017)
# Multiple matches. Which do you want to download?
#
# 1: dftRoadSafetyData_Vehicles_2017.zip
# 2: dftRoadSafetyData_Casualties_2017.zip
# 3: dftRoadSafetyData_Accidents_2017.zip
dl_stats19(year = 2017, type = "ac")
# Files identified: dftRoadSafetyData_Accidents_2017.zip
#
# Wanna do it (y = enter, n = esc)?
dl_stats19(year = 1985)
# Year not in range, changing to match 1979:2004 data
# This file is over 240 MB in size.
# Once unzipped it is over 1.8 GB.
# Files identified: Stats19-Data1979-2004.zip
#
# Wanna do it (y = enter, n = esc)?
Note: the last two stages is now combined by default as per this review We'll focus on areas flagged in the review for the rest of this response:
We have added a map (well technically 9 maps!) and a couple of time series plots showing the scale of the data. Also show a sample of the additional casualty and vehicle tables has been added to show more clearly the richness of data provided.
We also could not see the vignette when installing using This was the code we ran: devtools::install(build_vignettes = TRUE)
vignette(package = "stats19")
These have now been fixed - thanks for testing and reporting.
A CONTRIBUTING is added now. Thank you.
One is added with:
Review Comments
Thank you.
Done, appreciate your input.
DESCRIPTION: has the line LazyData which means stats19_schema is lazy loaded.
Back in, as stated above.
@mem48 answered this: cyipt/stats19 is not actually a proper R package. It is a repo containing scripts for CyIPT project, it has different sources (UK DS), and usage so there is no current need to adapt the use to this package. Malcolm is one of the contributors to this package.
This a reasonable point that we have thought of and discussed. We are open minded about changing the name but, as with so many things, there are +s and -s (outlined for some options below):
The main benefit we can see of changing the name would be making the package easier to find. We think good documentation and clear description and some write-ups of the package and what it does could address these issues. We've explored stat19 name and it links directly to (and is almost synonymous with) road crash data. See https://en.wikipedia.org/wiki/STATS19 for an excellent example (we plan to add this link to the README) so the name is OK for we think, but we're open minded to alternative names mentioned above and perhaps names we've not thought of.
Definitely. Thank you very much for your input. |
Quick question for @sckott, what is you policy regarding CRAN submission while it's under review? We may change this in response the the answer (or we submit tomorrow!): https://github.com/ITSLeeds/stats19/milestone/1. Look forward to your response @adamhsparks. |
Sounds great! I'll mark my calendar |
Heads-up @sckott I've submitted the package to CRAN and have added a Zenodo DOI badge as suggested. I've also submitted the package to JOSS for peer review. Next steps? Starting some work that may cite the package so will useful to have a 'cannonical' .bib file also. |
sounds good @Robinlovelace that's it. will close issue now. yeah, having a citation file is a good idea |
Fantastic! What is best practice when creating a CITATION file? I can see how to create it from the official documentation here: https://cran.r-project.org/doc/manuals/R-exts.html#CITATION-files But see that some packages, such as |
See here for an example of a CITATION file: https://github.com/cran/nlme/tree/master/inst |
And also, what is the official citation info of the package? Will it change when it is published in JOSS? |
official with respect to? I think JOSS would be an additional citation you could be put in the citaiton file |
Just 1 citation for the package is probably enough - that's the aim of the citation("sf")
#>
#> To cite package sf in publications use:
#>
#> Pebesma, E., 2018. Simple Features for R: Standardized Support
#> for Spatial Vector Data. The R Journal,
#> https://journal.r-project.org/archive/2018/RJ-2018-009/
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> author = {Edzer Pebesma},
#> title = {{Simple Features for R: Standardized Support for Spatial
#> Vector Data}},
#> year = {2018},
#> journal = {{The R Journal}},
#> url = {https://journal.r-project.org/archive/2018/RJ-2018-009/index.html},
#> } Created on 2019-01-15 by the reprex package (v0.2.1) The associated I think it may be useful for authors to have guidance from rOpenSci on this, seeing as you provide really useful guidance on, and state preferences on (no problem with that!) many aspects of package design. |
Here's the approach I've taken with my packages on JOSS. I have one for the paper and one for the package that will always give the version used.
|
Many thanks @adamhsparks, very useful. I plan to create something similar to this: https://github.com/ropensci/nasapower/blob/master/inst/CITATION Worth adding something on CITATION in the package guidance/process? |
Hi @Robinlovelace. Can you suggest a date by which you'd like to submit a draft blog post? |
Sure, how's a week tomorrow? In other words, remotes::install_github("ATFutures/calendar")
#> Skipping install of 'calendar' from a github remote, the SHA1 (6c67c0fc) has not changed since last install.
#> Use `force = TRUE` to force installation
library(calendar)
e = ic_event(start_time = "2019-02-14 10:00", end_time = "2019-02-14 10:00", summary = "Complete stats19 blog post")
e
#> # A tibble: 1 x 4
#> UID DTSTART DTEND SUMMARY
#> <chr> <dttm> <dttm> <chr>
#> 1 ical-f3df3945-5e5… 2019-02-14 10:00:00 2019-02-14 12:00:00 Complete sta…
class(e)
#> [1] "ical" "tbl_df" "tbl" "data.frame"
ic_write(e, "e.ics") Created on 2019-02-06 by the reprex package (v0.2.1) |
Resulting Calendar invite (cc @layik ): https://github.com/ATFutures/calendar/releases/download/0.0.1/e.ics |
Checking in to see if you might have a draft post submitted on Friday. Would love to publish 2019-02-19, but we can publish whenever it's ready |
I think that may be doable... |
Here's the blog post: https://github.com/ropensci/stats19/blob/master/vignettes/blog.Rmd Any comments/suggestions welcome. Think this is good to go! |
@stefaniebutland one thing: may be worth holding fire: I've just submitted stats19 0.2.0 to CRAN - will be good if the new features are available on the CRAN version when this drops. |
Another update: after some tweaks the article can be found here: https://itsleeds.github.io/stats19/articles/blog.html |
Update: just saw previos messages and am taking a look here https://github.com/ropensci/roweb2#contributing-a-blog-post |
We can absolutely wait till it's on CRAN. Thank you for getting the draft done. I'll do a proper review once it's submitted as a pull request (so I can check the technical issues at same time as the content) |
@Robinlovelace and @stefaniebutland I have now gone through I will send the PR and we can take it from there. |
sd 'http://www.pct.bike/' 'https://www.pct.bike/' CONTRIBUTING.md CRAN-RELEASE DESCRIPTION LICENSE.md NAMESPACE NEWS.md R R/data.R R/dl.R R/format.R R/get.R R/mot.R R/read.R R/stats19-package.R R/utils.R README.Rmd README.md azure-pipelines.yml codemeta.json cran-comments.md data data/accidents_sample.rda data/accidents_sample_raw.rda data/casualties_sample.rda data/casualties_sample_raw.rda data/file_names.rda data/file_names_old.rda data/police_boundaries.rda data/schema_original.rda data/stats19_schema.rda data/stats19_variables.rda data/vehicles_sample.rda data/vehicles_sample_raw.rda data-raw data-raw/README.md data-raw/misc.Rmd data-raw/schema.Rmd inst inst/2day-slides.Rmd inst/CITATION inst/ggtheme-plot.png inst/iow_example.R inst/national-cycling-data.R inst/rstudio-autocomplete.png inst/stats-19-exercises.Rmd inst/tmap-zones-interactive.png inst/ts_example.R inst/walking-cycling-innovations-slides.Rmd man man/accidents_sample.Rd man/casualties_sample.Rd man/check_input_file.Rd man/check_year.Rd man/dl_stats19.Rd man/figures man/figures/README-crash-date-plot-1.png man/figures/README-crash-time-plot-1.png man/figures/README-dates-1.png man/figures/README-ggplot-1.png man/figures/README-ggplot-ped-severity-1.png man/figures/README-sfplot-1.png man/figures/README-sfplot-2.png man/figures/README-sfplot-3.png man/figures/README-times-1.png man/figures/README-unnamed-chunk-2-1.png man/figures/README-unnamed-chunk-3-1.png man/figures/logo-500.png man/figures/logo-800.png man/figures/logo.png man/figures/logo.svg man/file_names.Rd man/find_file_name.Rd man/format_accidents.Rd man/format_casualties.Rd man/format_column_names.Rd man/format_ppp.Rd man/format_sf.Rd man/format_vehicles.Rd man/get_MOT.Rd man/get_data_directory.Rd man/get_stats19.Rd man/get_url.Rd man/locate_files.Rd man/locate_one_file.Rd man/phrase.Rd man/police_boundaries.Rd man/read_accidents.Rd man/read_casualties.Rd man/read_vehicles.Rd man/schema_original.Rd man/select_file.Rd man/set_data_directory.Rd man/stats19_schema.Rd man/vehicles_sample.Rd paper.bib paper.md responses1.Rmd responses2.Rmd stats19.Rproj tests tests/skip-download.R tests/testthat tests/testthat/test-data_dir.R tests/testthat/test-dl_stats19.R tests/testthat/test-format.R tests/testthat/test-get.R tests/testthat/test-mot.R tests/testthat/test-read_stats19.R tests/testthat/test-utils.R tests/testthat.R updates.sh vignettes vignettes/blog.Rmd vignettes/blog.md vignettes/blog_files vignettes/blog_files/figure-markdown_github vignettes/blog_files/figure-markdown_github/crashes-map-1.png vignettes/blog_files/figure-markdown_github/nfatalities-1.png vignettes/blog_files/figure-markdown_github/sfplot-1.png vignettes/packages.bib vignettes/references.bib vignettes/stats19-training-setup.Rmd vignettes/stats19-training.Rmd vignettes/stats19-vehicles.Rmd vignettes/stats19.Rmd vignettes/wy-overview.jpg sd 'http://www.r-pkg.org/pkg/stats19' 'https://www.r-pkg.org/pkg/stats19' CONTRIBUTING.md CRAN-RELEASE DESCRIPTION LICENSE.md NAMESPACE NEWS.md R R/data.R R/dl.R R/format.R R/get.R R/mot.R R/read.R R/stats19-package.R R/utils.R README.Rmd README.md azure-pipelines.yml codemeta.json cran-comments.md data data/accidents_sample.rda data/accidents_sample_raw.rda data/casualties_sample.rda data/casualties_sample_raw.rda data/file_names.rda data/file_names_old.rda data/police_boundaries.rda data/schema_original.rda data/stats19_schema.rda data/stats19_variables.rda data/vehicles_sample.rda data/vehicles_sample_raw.rda data-raw data-raw/README.md data-raw/misc.Rmd data-raw/schema.Rmd inst inst/2day-slides.Rmd inst/CITATION inst/ggtheme-plot.png inst/iow_example.R inst/national-cycling-data.R inst/rstudio-autocomplete.png inst/stats-19-exercises.Rmd inst/tmap-zones-interactive.png inst/ts_example.R inst/walking-cycling-innovations-slides.Rmd man man/accidents_sample.Rd man/casualties_sample.Rd man/check_input_file.Rd man/check_year.Rd man/dl_stats19.Rd man/figures man/figures/README-crash-date-plot-1.png man/figures/README-crash-time-plot-1.png man/figures/README-dates-1.png man/figures/README-ggplot-1.png man/figures/README-ggplot-ped-severity-1.png man/figures/README-sfplot-1.png man/figures/README-sfplot-2.png man/figures/README-sfplot-3.png man/figures/README-times-1.png man/figures/README-unnamed-chunk-2-1.png man/figures/README-unnamed-chunk-3-1.png man/figures/logo-500.png man/figures/logo-800.png man/figures/logo.png man/figures/logo.svg man/file_names.Rd man/find_file_name.Rd man/format_accidents.Rd man/format_casualties.Rd man/format_column_names.Rd man/format_ppp.Rd man/format_sf.Rd man/format_vehicles.Rd man/get_MOT.Rd man/get_data_directory.Rd man/get_stats19.Rd man/get_url.Rd man/locate_files.Rd man/locate_one_file.Rd man/phrase.Rd man/police_boundaries.Rd man/read_accidents.Rd man/read_casualties.Rd man/read_vehicles.Rd man/schema_original.Rd man/select_file.Rd man/set_data_directory.Rd man/stats19_schema.Rd man/vehicles_sample.Rd paper.bib paper.md responses1.Rmd responses2.Rmd stats19.Rproj tests tests/skip-download.R tests/testthat tests/testthat/test-data_dir.R tests/testthat/test-dl_stats19.R tests/testthat/test-format.R tests/testthat/test-get.R tests/testthat/test-mot.R tests/testthat/test-read_stats19.R tests/testthat/test-utils.R tests/testthat.R updates.sh vignettes vignettes/blog.Rmd vignettes/blog.md vignettes/blog_files vignettes/blog_files/figure-markdown_github vignettes/blog_files/figure-markdown_github/crashes-map-1.png vignettes/blog_files/figure-markdown_github/nfatalities-1.png vignettes/blog_files/figure-markdown_github/sfplot-1.png vignettes/packages.bib vignettes/references.bib vignettes/stats19-training-setup.Rmd vignettes/stats19-training.Rmd vignettes/stats19-vehicles.Rmd vignettes/stats19.Rmd vignettes/wy-overview.jpg sd 'ropensci/software-review#266' 'ropensci/software-review#266' CONTRIBUTING.md CRAN-RELEASE DESCRIPTION LICENSE.md NAMESPACE NEWS.md R R/data.R R/dl.R R/format.R R/get.R R/mot.R R/read.R R/stats19-package.R R/utils.R README.Rmd README.md azure-pipelines.yml codemeta.json cran-comments.md data data/accidents_sample.rda data/accidents_sample_raw.rda data/casualties_sample.rda data/casualties_sample_raw.rda data/file_names.rda data/file_names_old.rda data/police_boundaries.rda data/schema_original.rda data/stats19_schema.rda data/stats19_variables.rda data/vehicles_sample.rda data/vehicles_sample_raw.rda data-raw data-raw/README.md data-raw/misc.Rmd data-raw/schema.Rmd inst inst/2day-slides.Rmd inst/CITATION inst/ggtheme-plot.png inst/iow_example.R inst/national-cycling-data.R inst/rstudio-autocomplete.png inst/stats-19-exercises.Rmd inst/tmap-zones-interactive.png inst/ts_example.R inst/walking-cycling-innovations-slides.Rmd man man/accidents_sample.Rd man/casualties_sample.Rd man/check_input_file.Rd man/check_year.Rd man/dl_stats19.Rd man/figures man/figures/README-crash-date-plot-1.png man/figures/README-crash-time-plot-1.png man/figures/README-dates-1.png man/figures/README-ggplot-1.png man/figures/README-ggplot-ped-severity-1.png man/figures/README-sfplot-1.png man/figures/README-sfplot-2.png man/figures/README-sfplot-3.png man/figures/README-times-1.png man/figures/README-unnamed-chunk-2-1.png man/figures/README-unnamed-chunk-3-1.png man/figures/logo-500.png man/figures/logo-800.png man/figures/logo.png man/figures/logo.svg man/file_names.Rd man/find_file_name.Rd man/format_accidents.Rd man/format_casualties.Rd man/format_column_names.Rd man/format_ppp.Rd man/format_sf.Rd man/format_vehicles.Rd man/get_MOT.Rd man/get_data_directory.Rd man/get_stats19.Rd man/get_url.Rd man/locate_files.Rd man/locate_one_file.Rd man/phrase.Rd man/police_boundaries.Rd man/read_accidents.Rd man/read_casualties.Rd man/read_vehicles.Rd man/schema_original.Rd man/select_file.Rd man/set_data_directory.Rd man/stats19_schema.Rd man/vehicles_sample.Rd paper.bib paper.md responses1.Rmd responses2.Rmd stats19.Rproj tests tests/skip-download.R tests/testthat tests/testthat/test-data_dir.R tests/testthat/test-dl_stats19.R tests/testthat/test-format.R tests/testthat/test-get.R tests/testthat/test-mot.R tests/testthat/test-read_stats19.R tests/testthat/test-utils.R tests/testthat.R updates.sh vignettes vignettes/blog.Rmd vignettes/blog.md vignettes/blog_files vignettes/blog_files/figure-markdown_github vignettes/blog_files/figure-markdown_github/crashes-map-1.png vignettes/blog_files/figure-markdown_github/nfatalities-1.png vignettes/blog_files/figure-markdown_github/sfplot-1.png vignettes/packages.bib vignettes/references.bib vignettes/stats19-training-setup.Rmd vignettes/stats19-training.Rmd vignettes/stats19-vehicles.Rmd vignettes/stats19.Rmd vignettes/wy-overview.jpg
Summary
The goal of stats19 is to make it easy to work with road crash data. Specifically it enables access to and processing of the UK’s official road traffic casualty database, which is called STATS19.
URL for the package (the development repository, not a stylized html page): https://github.com/ITSLeeds/stats19
Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):
data retrieval: downloads and formats publicly available road safety data that is difficult (impossible for most people) to use effectively without
Who is the target audience and what are scientific applications of this package?
Academic, industry and public sector researchers investigating road safety, specifically people wanting to perform natural experimental studies to find out how best to make the transport system safer. It will also be of interest for people interested in large point-on-network data for methodological applications.
yours differ or meet our criteria for best-in-category?
Previous packages/code-bases:
but this package takes the best of the preceding packages and adds new features including:
automated and reproducible data formatting code based on DfT's official guidelines (before labels were hard-coded)
continuous integration
integration with sf (the stplanr implementation used sp)
inclusion of much more data (back to 1974 and until 2017, previous work only covered 1985:2015)
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.Detail
Does
R CMD check
(ordevtools::check()
) succeed? Paste and describe any errors or warnings:Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
Currently the style uses
=
assignment, which differs from the recommended styleIf this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
The text was updated successfully, but these errors were encountered: