Question: Have you thought about these validation rules? #1729
Comments
Hi @dancesWithCycles! Thanks for your patience - our team had several different discussions about your proposed rules here.
Could you share some more context on how you know that the stop_times.txt entries should belong to different trips but are associated with the same trip_id? We assume you're deriving this from the same stop being serviced at different times that are extremely close together, like 8am and 8:10am on the same day, but we're curious to know more. It would be very helpful if you had a feed example with trip rows to include as well.
Could you share examples of feeds where you're seeing this use case? It may warrant an INFO notice in the validator to flag that something looks strange, but we're wondering if there are cases of aggregate feeds where it might be legitimate.
Currently, making this rule dynamic is outside the scope of what's possible with the GTFS validator. However, custom validation in the validator has been a long-standing feature request that we intend to address in the future (though not within the next year). If you'd like to share your thoughts or needs on this feature, there's an issue for it here.
Similar to the above question, this is out of scope at present because it requires dynamic inputs. However, Transport Data Gouv has a great GTFS diff tool that can help compare two different feed versions and see if a trip count looks dramatically different from how it did previously. We also provide a trip count in the summary of the validation report. Let me know if you have any other questions!
Hi there,
Cheers!
Describe the problem
Hi folks,
Thank you so much for maintaining this repository and this rule overview!
I came across the Duplicate Route Name Rule and thought to myself:
According to trips.txt, trip_id and trip_short_name shall be unique (at least on a service day basis). If a GTFS feed is the result of merging public transport schedule data from many different sources and providers, I commonly observe that trip_ids and trip_short_names are unique within a single source but no longer unique in the resulting overall GTFS feed. Looking at the first and last departure stop and time, I can tell that several stop_times.txt entries should belong to different trips but have the same trip_id or trip_short_name. Any idea how to tackle this observation with the GTFS validator?
How about vice versa? Has anyone already given a Duplicate Departure Stop Time Rule a thought? I am observing stop_times.txt entries that differ in agency_id, route_id, trip_id and trip_short_name but have the same first and last departure_time and stop_id. I cannot imagine several trips with identical first and last departure_time and stop_id but different agency_id, route_id and trip_id. Can anyone picture this case and suggest how to tackle it with the GTFS validator?
I stumbled over the Trip Coverage Next Days Rule. I like this rule very much. Kudos! I could make even more use of this rule if the number of days were an argument that I could supply to the GTFS validator as a parameter on a feed-specific or customer-specific basis. Any idea if a dynamic rule like this is possible with the current architecture of the GTFS validator?
I am also wondering if we can derive from the Trip Coverage Next Days Rule an Agency Coverage Trip Count Rule. I have observed many times that agencies provide public transport schedule data only for a subset of trips and not for all trips. In other words, the data delivery is missing the remaining trips. As a consequence, only a subset of trips per agency is counted in the resulting GTFS feed. If I provide the GTFS validator with a list of minimum trip counts per agency in a CSV-like file, do you think this observation could be tackled by a validation rule? The validator should tell me the agencies whose trip counts are below the minimum trip count per agency threshold.
Cheers!
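To make the Duplicate Departure Stop Time idea more concrete, here is a minimal Python sketch that runs outside the GTFS validator. It assumes stop_times.txt rows carry trip_id, departure_time, stop_id and stop_sequence. The function name trips_sharing_endpoints and the sample rows are hypothetical, and the sketch ignores service days, so its matches are only candidates for review:

```python
import csv
import io
from collections import defaultdict


def trips_sharing_endpoints(stop_times_rows):
    """Return {signature: [trip_id, ...]} for trips whose first and last
    stop_id and departure_time are identical (hypothetical helper)."""
    by_trip = defaultdict(list)
    for row in stop_times_rows:
        by_trip[row["trip_id"]].append(row)

    by_signature = defaultdict(list)
    for trip_id, rows in by_trip.items():
        # stop_sequence defines the order of stops within a trip.
        rows.sort(key=lambda r: int(r["stop_sequence"]))
        first, last = rows[0], rows[-1]
        signature = (
            first["stop_id"], first["departure_time"],
            last["stop_id"], last["departure_time"],
        )
        by_signature[signature].append(trip_id)

    # Keep only signatures shared by two or more trips.
    return {sig: sorted(ids) for sig, ids in by_signature.items() if len(ids) > 1}


# Made-up stop_times.txt content, for illustration only.
SAMPLE_STOP_TIMES = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
A1,08:00:00,08:00:00,S1,1
A1,08:30:00,08:30:00,S9,2
B7,08:00:00,08:00:00,S1,1
B7,08:15:00,08:15:00,S5,2
B7,08:30:00,08:30:00,S9,3
C3,09:00:00,09:00:00,S1,1
C3,09:30:00,09:30:00,S9,2
"""

suspects = trips_sharing_endpoints(csv.DictReader(io.StringIO(SAMPLE_STOP_TIMES)))
for signature, trip_ids in suspects.items():
    print(signature, trip_ids)
```

On the sample rows this reports A1 and B7, which start and end at the same stops at the same times. A real check would probably also compare service_id from trips.txt before flagging anything.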
Describe the new validation rule
Please see above.
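As a rough illustration of the Agency Coverage Trip Count idea, the following Python sketch counts trips per agency and compares them against a threshold list. It is a standalone script, not part of the GTFS validator. It assumes routes.txt carries an agency_id for every route, and the threshold file with its columns agency_id and min_trip_count is a hypothetical format, as are the function name and sample data:

```python
import csv
import io
from collections import Counter


def agencies_below_minimum(routes_rows, trips_rows, minimum_rows):
    """Return {agency_id: (actual_trips, minimum_trips)} for each agency
    whose trip count is below its threshold (hypothetical helper)."""
    route_to_agency = {r["route_id"]: r["agency_id"] for r in routes_rows}

    counts = Counter()
    for trip in trips_rows:
        agency_id = route_to_agency.get(trip["route_id"])
        if agency_id is not None:  # skip trips whose route is unknown
            counts[agency_id] += 1

    shortfalls = {}
    for row in minimum_rows:
        minimum = int(row["min_trip_count"])
        actual = counts.get(row["agency_id"], 0)
        if actual < minimum:
            shortfalls[row["agency_id"]] = (actual, minimum)
    return shortfalls


def read_csv(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up feed excerpts and thresholds, for illustration only.
ROUTES = "route_id,agency_id\nR1,AG1\nR2,AG2\n"
TRIPS = "route_id,service_id,trip_id\nR1,WK,T1\nR1,WK,T2\nR2,WK,T3\n"
MINIMUMS = "agency_id,min_trip_count\nAG1,2\nAG2,5\nAG3,1\n"

shortfalls = agencies_below_minimum(
    read_csv(ROUTES), read_csv(TRIPS), read_csv(MINIMUMS)
)
for agency_id, (actual, minimum) in shortfalls.items():
    print(f"{agency_id}: {actual} trips, expected at least {minimum}")
```

Here AG2 falls short of its threshold and AG3 is reported because it has no trips at all, which is the missing-delivery case described above.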
Sample GTFS datasets
Please see above.
Severity
Please see above.
Additional context
Please see above.