-
-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gtfsr package #55
Comments
Passes all editors' checks, is a good fit Seeking reviewers... |
Reviewers: @Robinlovelace |
@Robinlovelace |
Thanks for the reminder and opportunity - here is my review of the package: IntroductionGTFS is becoming the worldwide standard for sharing information about public transport systems. This makes it hugely important for the transport industry, public sector organisations trying to direct such systems in socially beneficial directions, and for researchers who have a range of motivations including the need to transition away from fossil fuels to save the world. In any case, GTFS data is largely out of reach for the humble citizen like you and me. Googling "software gtfs data" reveals a cottage industry of GTFS editors and viewers, but little in the way of open source software for accessing this vital new file format. A GIS StackExchange question further shows the lack of cohesion around a single open source product (the World Bank provides a more up-to-date list of tools that can interact with GTFS data). R is a powerful language that provides an emerging set of transport tools via packages such as stplanr, so rOpenSci is the ideal place for the development of a new package to make GTFS available to the world, in the spirit of citizen science. Basic functionalityThe package is installed and loaded easily with: devtools::install_github('ropenscilabs/gtfsr')
library(gtfsr) The most common thing that someone will want to use the package for, I imagine, is read in a GTFS file. This can be done easily and, based on a few tests of feeds from transit.land, reliably: # this one failed for some reason
u = "https://www.bjcta.org/wp-content/uploads/2016/05/BJCTA_GTFS_0516.zip"
# the example in the README works
u = "http://www.co.fairbanks.ak.us/transportation/MACSDocuments/GTFS.zip"
# as does this giant one from Rio de Janeiro
u = "http://dadosabertos2.rio.rj.gov.br/dadoaberto/google-transit/google_transit.zip"
# as does this one from the USA
u = "http://www.bart.gov/dev/schedules/google_transit.zip"
gtfs_obj = import_gtfs(u) The next logical thing you would want to do is plot the result. (Suggestion: make simply loading and plotting GTFS data appear earlier and more prominently in the README.) This can be done easily with the sensible but clunkily-named map_gtfs_agency_routes(gtfs_obj)
map_gtfs_agency_routes(gtfs_obj, include_stops = TRUE) I think in most cases users will want to see the stops. Suggestion: make Accessing GTFS feedsA useful feature of the package, that goes above and beyond other GTFS software, is its ability to search for public GTFS feeds, using the transitfeeds.com API key. The guidance on how to get a key in the README is good. However, advice for storing the key could be better: the if(grepl("[[:alnum:]]{8}\\-[[:alnum:]]{4}\\-[[:alnum:]]{4}\\-[[:alnum:]]{4}\\-[[:alnum:]]{12}", Sys.getenv("GTFS_API_KEY")))
gtfs_api_key$set(Sys.getenv("GTFS_API_KEY")) We use a similar technique in stplanr with In any case, the most important thing is that users can quickly view valid gtfs feeds: feedlist_df = filter_feedlist(get_feedlist()) # errors:
# Error in function_list[[k]](value) : could not find function "mutate" Note that the above code errors, probably because the function After dplyr has been loaded, the function works: library(dplyr)
feedlist_df = filter_feedlist(get_feedlist()) As per the documentation, you can then find feeds from a specific country, e.g. Spain: u_spain = filter(feedlist_df, grepl("Spain", loc_t)) %>%
select(url_d)
# get all gtfs feeds from spain
gtfs_spain = import_gtfs(u_spain) Of the 3 imported GTFS feeds, only 1 can be plotted: map_gtfs_agency_routes(gtfs_spain[[1]]) # fails - not a gtfs object
map_gtfs_agency_routes(gtfs_spain[[2]]) # works
map_gtfs_agency_routes(gtfs_spain[[3]]) # fails - no route_id I think that the fact that only 1 of these Spanish GTFS datasets works says more about the variable quality of GTFS data than the package. However, such testing on a wide range of international examples could help further refine the package and make it more accepting of diverse feed formats. Suggestion: test gtfsr on a wide range of datasets, perhaps using continuous integration, although not sure how this would work with its reliance on external datasets - any ideas? Data consistency issues aside (which are clearly not the fault of gtfsr), the package clearly does a great job of making GTFS data more accessible to the masses. Compatibility with spatial classesTo increase the utility of GTFS objects, I think it would be fantastic if the package included functions to export them as spatial objects. That would expose them to the power of R's impressive GIS capabilities. Suggestion: explore options for converting Note: stplanr has a basic function for this: gtfs2sldf(). DocumentationThe package's vignette is currently the same as the README. This is useful for now but in the long-run the vignettes should be self-standing. Suggestion: make the README shorter and self-standing, make the vignette longer with more links to existing software and documentation for understanding and working with GTFS data. Minor issuesThe first thing I noticed when cloning this repo was its massive size (44 MB). This is predominently due to previously deleted files in the .git repo. It won't cause issues when installing it but will be a deterrent to potential developers. Suggestion: removing the culprits, e.g. using the BFG repo cleaner. |
An additional comment, more one for the longer term, is how can this interact with routing? Are there plans in future versions of gtfsr to allow journey planning, e.g. via an interface to something like OpenTripPlanner? This could help the GTFS data be used for analysis of spatio-temporal accessibility. |
Thanks for the review @Robinlovelace ! |
@ultinomics review is in, let us know if you need any help |
Will do! I plan to go through the edits during the first week of September. On Aug 17, 2016 3:02 PM, "Scott Chamberlain" notifications@github.com
|
okay |
Update: I will be working on this during the last week of September during TransLoc internal "dev days". Thanks everyone and sorry for dragging my feet! |
thanks for the update |
I've gone through most of the review (thanks, @Robinlovelace!) but still need to update the README and vignette and learn how to save the API key to .Renviron. I've also gone through most all the issues that I considered bugs. You can see what I have to work through still on the issues page. Otherwise, the most pressing suggestions have been implemented and are currently on a branch called Let me know what's next! I really pumped about the new, updated mapping feature but would be great to get folks testing it! |
Great work - not had a chance to test it but looking forward to that. |
@ultinomics thanks for the changes. I'll try it out the new map stuff.
In addition to readme and vignette changes ... The API key change thing is important to get done as it's a security issue. Users of this pkg will possibly unintentionally leave their API key in an R script and then put their code up on the web (e.g., on github) for all the world to see. If they store as an env var, that won't happen |
I'll get in it! Good point! On Oct 6, 2016 3:32 PM, "Scott Chamberlain" notifications@github.com
|
@ultinomics i was going to test out but the |
@sckott it was! did a few edits the day after but forgot to update ya. |
Noticed a few more things:
|
@ultinomics any update on your work on this package? |
It is on my to do list for after Dec 2nd. Unfortunately, the last few weeks have been consumed by PhD work.
I’m reserving the package as my “fun thing to do over winter break”. I plan on tackling the API key problem next.
… On Nov 26, 2016, at 6:28 PM, Scott Chamberlain ***@***.***> wrote:
@ultinomics <https://github.com/ultinomics> any update on your work on this package?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#55 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFaR2Rni4pkpzB9MQ445biSlfEj3IFA6ks5rCMCzgaJpZM4I-MZq>.
|
@sckott My bad on close/comment. Alright! I added the API Anything else I can do? I know I need a lot more examples still but I hope it's good enough. |
no worries |
@Robinlovelace any other thoughts at this point? I'm having a last look now |
No more thoughts from me Scott (although I must confess that this is partly due to lack of time) - it looks like a well-documented and useful package! Any ideas on integration with stplanr - lots of talk there about routing, sf and all sorts at the moment. Great work @ultinomics. |
@ultinomics No more things, although do make sure to add more examples - perhaps open an issue https://github.com/ropenscilabs/gtfsr/issues to remind yourself to do that, approved |
Hey @sckott, I was wondering what the next steps are. The on boarding process says "Once your package is approved, we will provide further instructions on transferring your repository to the rOpenSci repository" but no sure if I ever received those instructions. If I did, my apologies! |
Hi @ultinomics -
|
sure, no rush, i think we have a backlog anyway for guest blog posts |
Okay, you're transferred https://github.com/ropensci/gtfsr |
What does this package do? (explain in 50 words or less)
Currently, the package facilitates the import of GTFS data (from url or local path) to create
gtfs
objects, validates file structure ofgtfs
objects, and provides functions to easily plot stops, routes, or networks.Future versions will build validation and modeling capabilities, and options to layer other data.
Paste the full DESCRIPTION file inside a code block below.
URL for the package (the development repository, not a stylized html page)
https://github.com/ropenscilabs/gtfsr
What data source(s) does it work with (if applicable)?
It is designed to work with any text files following the GTFS format. It does not come with any accompanying data.
Who is the target audience?
Transportation Economists, Transportation Consultants and Researchers, Urban Planners, Open Data Enthusiasts
Are there other R packages that accomplish the same thing? If so, what is different about yours?
None
Check the box next to each policy below, confirming that you agree. These are mandatory.
Do you agree to follow the rOpenSci packaging guidelines? These aren't mandatory, but we strongly suggest you follow them. If you disagree with anything, please explain.
R CMD check
(ordevtools::check()
) produce any errors or warnings? If so paste them below.The text was updated successfully, but these errors were encountered: