Make bikes_allowed a recommended field in GTFS #461

Open
isabelle-dr opened this issue May 22, 2024 · 10 comments
Labels
Change: Best Practice Changes focusing on recommendations for optimal use of the specification. GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule Support: Needs Feedback

Comments

@isabelle-dr
Collaborator

isabelle-dr commented May 22, 2024

Context

As part of our efforts to merge the GTFS Best Practices into the spec, we are:

  1. migrating the current Best Practices into the spec based on community consensus (plan for Schedule, plan for Realtime).

  2. evaluating all outstanding issues and PRs that existed on the Best Practice repos, and proposing new Best Practices to be added directly into the spec, if still relevant.

Scope for this issue

This issue picks up on MobilityData/GTFS_Schedule_Best-Practices#56 by @bdferris-v2, which aims to make bikes_allowed a recommended field in GTFS.
We would update the presence requirement from Optional to Recommended, and datasets that don't have this field would get the missing_bike_allowance canonical validator notice for all trips (currently just ferry trips).
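
As a rough illustration of what extending the notice to all trips would flag, here is a minimal Python sketch that lists the trips in a trips.txt with no bikes_allowed value. It is not the canonical validator's implementation; the function name trips_missing_bike_allowance and the sample rows are made up for this example, and it only treats an absent or empty value as missing (whether an explicit 0 should also be flagged is left open here).

```python
import csv
import io


def trips_missing_bike_allowance(trips_txt: str) -> list[str]:
    """Return the trip_ids whose bikes_allowed value is absent or empty.

    Hypothetical helper for illustration only; it also works when the
    bikes_allowed column is missing from the file entirely.
    """
    reader = csv.DictReader(io.StringIO(trips_txt))
    missing = []
    for row in reader:
        # row.get() yields None when the column (or the cell) is absent.
        value = (row.get("bikes_allowed") or "").strip()
        if value == "":
            missing.append(row["trip_id"])
    return missing


# Made-up sample data: T2 has an empty bikes_allowed cell.
sample = """route_id,service_id,trip_id,bikes_allowed
R1,WK,T1,1
R1,WK,T2,
R2,WK,T3,2
"""

print(trips_missing_bike_allowance(sample))  # ['T2']
```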

Considerations

  1. What are other GTFS fields currently recommended?

feed_start_date, feed_end_date, feed_version.

  2. Why aren't we just making all optional files/fields recommended?

Certain Optional files/fields are dependent on the service (e.g. timeframes.txt is for modeling fares based on time of day), whereas other files can and should always be added regardless of the type of service being modeled (e.g. feed_info.txt). We think there is value in calling out the latter explicitly using the term Recommended to promote higher-quality GTFS.

Would you support this change?
Tagging folks who engaged in adding the Recommended presence into GTFS @e-lo @antrim @bdferris-v2 @NomeQ @gcamp @evansiroky @markstos @westontrillium @derhuerst @doconnoronca @dekarl

@isabelle-dr isabelle-dr added Change: Best Practice Changes focusing on recommendations for optimal use of the specification. Support: Needs Feedback labels May 22, 2024
@gcamp
Contributor

gcamp commented May 22, 2024

I agree in principle that bikes_allowed should be recommended, but I find it strange that this field gets that treatment before other generally useful optional fields (like trip/stop_headsign or route_color).

Maybe there should also be a gradation of recommendation, to differentiate between fields for specific use cases (bike trip planning) and more general optional fields that are used in every use case. Maybe "Recommended" and "Suggested"?

@markstos
Contributor

Is there a way to review how many GTFS feeds are providing "bikes allowed" in practice currently? If it's a rarely used field, it could be annoying for operators across the world to update their feeds to add "bikes_allowed:false".

If it's already a popular field, it seems like a clear win.

If there are a bunch of operators that do allow bikes in practice but don't reflect that in their GTFS feeds, then I think it could also be a win if it pushes a number of operators to have more accurate and complete feeds.

@eliasmbd eliasmbd added the GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule label May 22, 2024
@isabelle-dr
Collaborator Author

@markstos good question.

We have 367 datasets that have bikes_allowed on the Mobility Database, out of 1527 GTFS Schedule datasets, which is about 24%.

For comparison, route_color is at 73% and headsigns are at 82%.

@westontrillium
Contributor

@isabelle-dr Interesting data! Is that based on feeds where those fields are defined, not just present as an empty column?

I agree that we should think about how we justify raising this specific field to "recommended" status before other fields/features that are equally, if not more, universally useful. We should also keep in mind that making a component "recommended" essentially says that the best practice is to include it, and that any feed that does not is therefore not aligned with best practices. That can have implications for the contractual obligations of vendors to provide data of a certain quality ("...shall provide GTFS in accordance with best practices...") and for regional regulatory requirements placed upon agencies.

I like the idea of encouraging this field's use. Maybe I'm making mountains out of molehills, but one issue I see in designating it as a recommended field is the fact that it includes a "no information" option (empty/0). It's unclear what would be gained by introducing this recommendation, since there is always going to be the possibility that the exclusion of a 1 or 2 value in that column is justified and intentional because the information is genuinely unknown, or too unreliable to state outright whether bikes are allowed. Is limiting it to "recommended if explicitly allowed/disallowed" enough of a mitigation? How does one independently verify with confidence that "explicit allowance/disallowance of bikes" is not applicable for the trip, and therefore best practices are not violated with absence of 1 or 2? At the very least, a validation tool couldn't do it; it would have to just give a warning that the feed may not follow best practices for every instance where the field is empty or not present.
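
To make the "no information" concern concrete, here is a small Python sketch of how a tool might bucket bikes_allowed values, using the meanings in the GTFS reference (0 or empty: no bike information, 1: bikes allowed, 2: bikes not allowed). The names BIKES_ALLOWED_MEANING and summarize_bikes_allowed are hypothetical and the input values are invented. Note that an empty cell and an explicit 0 land in the same bucket, so a checker working from the data alone has no way to tell a deliberate "unknown" from an omission.

```python
from collections import Counter

# Meanings per the GTFS reference; the bucket labels are this sketch's own.
BIKES_ALLOWED_MEANING = {
    "": "no_info",
    "0": "no_info",
    "1": "allowed",
    "2": "not_allowed",
}


def summarize_bikes_allowed(values):
    """Count how many trips fall into each bikes_allowed bucket."""
    counts = Counter()
    for value in values:
        key = (value or "").strip()  # None (column absent) counts as empty
        counts[BIKES_ALLOWED_MEANING.get(key, "invalid")] += 1
    return counts


# Invented values, one per trip.
summary = summarize_bikes_allowed(["1", "", "0", "2", None])
print(summary["no_info"], summary["allowed"], summary["not_allowed"])  # 3 1 1
```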

@bdferris-v2
Collaborator

In light of MobilityData/GTFS_Schedule_Best-Practices#56, I don't suppose I can argue against making the field recommended 😇 But I do hear arguments that there may be other fields that might take higher priority.

@isabelle-dr
Collaborator Author

Is that based on feeds where those fields are defined, not just present as an empty column?

@westontrillium correct. Our current logic is: the column is present and there are values defined for at least one record (having values for certain records but not all of them seems very rare so we kept it simple).
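
The rule described above (the column is present and at least one record has a value) could be sketched in Python as follows. This is only an illustration of the stated logic, not the Mobility Database's actual code; field_is_defined and the sample tables are hypothetical.

```python
import csv
import io


def field_is_defined(table_txt: str, field: str) -> bool:
    """True if the column exists and at least one record has a non-empty value."""
    reader = csv.DictReader(io.StringIO(table_txt))
    if field not in (reader.fieldnames or []):
        return False
    return any((row.get(field) or "").strip() for row in reader)


# Invented examples: an empty column does not count as the feature being present.
empty_column = "trip_id,bikes_allowed\nT1,\nT2,\n"
one_value = "trip_id,bikes_allowed\nT1,\nT2,1\n"
no_column = "trip_id\nT1\n"

print(field_is_defined(empty_column, "bikes_allowed"))  # False
print(field_is_defined(one_value, "bikes_allowed"))     # True
print(field_is_defined(no_column, "bikes_allowed"))     # False
```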

@isabelle-dr
Collaborator Author

isabelle-dr commented Jun 5, 2024

Thank you for providing feedback on this issue!

This issue made me want to look at what the most common optional features are on the Mobility Database: ~1500 GTFS Schedule datasets across 79 countries. Since the US accounts for approximately half of these right now, I also checked non-US data specifically. These are the 6 most represented optional features in both cases.

| Feature | All data | Non-US data |
|---|---|---|
| Shapes | 84% | 72% |
| Headsigns | 82% | 79% |
| Route Colors | 73% | 57% |
| Wheelchair Accessibility | 62% | 41% |
| Feed Information | 62% | 43% |
| Location Types | 60% | 46% |

We believe adding recommendations based on what we see in practice is in line with the spirit of the GTFS Best Practices (cc @antrim for more context if needed), so I'm thinking of opening two new issues to make the Route Colors and Headsigns fields recommended. Would you support this?

@isabelle-dr
Collaborator Author

Now, to answer @westontrillium's good points:

the effect on contractual obligations ("...shall provide GTFS in accordance with best practices...")

I think this is a desired effect of making this type of change, no? We could potentially rely on the Canonical GTFS Schedule Validator's versioning, since a new version is released whenever the validator is updated to follow the spec. A contract that references a specific validator version would then not change as GTFS evolves (but again, maybe that change is what we want?).

How does one independently verify with confidence that "explicit allowance/disallowance of bikes" is not applicable for the trip, and therefore best practices are not violated with absence of 1 or 2?

I think we can split quality evaluation into:

  1. the data is modeled properly (i.e. the data can be re-used confidently, no bus is going back in time, foreign IDs exist, the station hierarchy makes sense, etc.).
  2. the data is comprehensive (i.e. it contains fields/values that improve what riders are seeing).
  3. the data is aligned with the real world (e.g. the stop is actually at this lat/lon, the fare is accurate, bike allowance info is, in fact, not available yet for this agency, etc.).

The Spec and Best Practices play a role in 1 and 2, and these can mostly be checked automatically. For 2, our validator reports metadata (e.g. the dataset contains Fares) and raises warnings for files and fields that are explicitly recommended (e.g. missing_recommended_file for feed_info.txt).

Verifying 3 is along the lines of the Grading Scheme.
I think the verification that the dataset accurately represents info accessible in the real world should be done outside the spec. I'd avoid statements such as: "it is recommended to add values 1 and 2 to bikes_allowed if the info is available".

@bijustrada360

Wouldn't route colors and shapes be dependent on each other? That is, why would one provide a route color if the shape for the route is not provided?

@bijustrada360

bijustrada360 commented Jul 7, 2024

wheelchair_accessible and bikes_allowed are both attributes associated with a vehicle, as mentioned in the specification document. Having said that, most planning/scheduling systems do provide the ability to denote that information at the block level so that dispatchers can assign the appropriate vehicle to the block. But there is no guarantee that on the service day a vehicle with the above attributes would be available to service the block that contains those trips. Having that information in the GTFS schedule is nice, but it should be appropriately reflected in GTFS-RT in the event the information changes.

Furthermore, most vehicle manufacturers today are producing vehicles that are low-floor and wheelchair accessible and come with configurations that make it easy to install a bike rack. Unless a transit agency has vehicles in its fleet that have been in service for over 10 years, it's most likely covered in both cases. Now, since we are talking about transit agencies across the world, I'm sure there are agencies that still have to manage this information, but in the years to come they will be on the decline. As a GTFS producer, we recently had to remove the two attributes for a transit agency since they no longer required them.

I may sound contradictory but what I have presented here is what is technically feasible from a transit planning/scheduling/dispatch perspective (first para) and what is most likely the ground reality today or in the near future (second para).

7 participants