frequencies.txt exact_times=1 trip_id semantics #227

Closed
derhuerst opened this issue Jun 6, 2020 · 14 comments
Labels
  • GTFS Schedule: Issues and Pull Requests that focus on GTFS Schedule.
  • Status: Stale: Issues and Pull Requests that have remained inactive for 30 calendar days or more.
  • Support: Needs Help: Needs support to answer outstanding questions and/or feedback.

Comments

@derhuerst

I have a question about the semantics of frequencies.txt with exact_times=1.


From my experience with GTFS and my observations of GTFS-based UIs, a lot of tools seem to make the following assumption:

A single trip (as defined by a unique trip_id in the GTFS dataset) is one vehicle (or group of vehicles) that I can use without significant interruptions (such as waiting for another vehicle or changing to a different line). Often, a single trip is considered to be a vehicle which I can travel with for the whole duration of the trip.

The frequencies.txt documentation seems to undermine that assumption, however:

frequencies.txt represents trips that operate on regular headways (time between trips). This file can be used to represent two different types of service.

  • Frequency-based service (exact_times=0) in which service does not follow a fixed schedule throughout the day. Instead, operators attempt to strictly maintain predetermined headways for trips.
  • A compressed representation of schedule-based service (exact_times=1) that has the exact same headway for trips over specified time period(s). In schedule-based service operators try to strictly adhere to a schedule.

| Field Name | Type | Required | Description |
| --- | --- | --- | --- |
| trip_id | ID referencing trips.trip_id | Required | Identifies a trip to which the specified headway of service applies. |

I think this is especially important for routing engines: they can no longer assume that every GTFS data point referring to the same trip_id is tied to one vehicle allowing continuous travel. There are >=1 "runs" of a vehicle, all under the same trip_id, but each of them ends at the last stop specified in stop_times.txt.


Is my understanding of the semantics correct? If it is, I'd argue that this is quite unintuitive and therefore easy to implement incorrectly. If I misunderstood how frequencies.txt works, let's improve the documentation.

@skinkie
Contributor

skinkie commented Jun 6, 2020 via email

@derhuerst
Author

Let's consider an excerpt from the example feed linked in the spec:

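stop_times.txt

| trip_id | arrival_time | departure_time | stop_id | stop_sequence | pickup_type | drop_off_type |
| --- | --- | --- | --- | --- | --- | --- |
| AWE1 | 0:06:10 | 0:06:10 | S1 | 1 | 0 | 0 |
| AWE1 | | | S2 | 2 | 1 | 3 |
| AWE1 | 0:06:20 | 0:06:30 | S3 | 3 | 0 | 0 |
| AWE1 | | | S5 | 4 | 0 | 0 |
| AWE1 | 0:06:45 | 0:06:45 | S6 | 5 | 0 | 0 |

frequencies.txt

| trip_id | start_time | end_time | headway_secs |
| --- | --- | --- | --- |
| AWE1 | 05:30:00 | 06:30:00 | 300 |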

If I assume that all stop_times.txt/frequencies.txt rows for AWE1 describe one "run" of one vehicle that I can use continuously, then I could conclude that I can stay in the vehicle from 05:30:00 (earliest start in time frame) until 7:00:00 (latest start in time frame + 35min). I assume this is not the case?

@skinkie
Contributor

skinkie commented Jun 6, 2020

It is not one run (or block). It is a normalisation form of transit data, including a confidence interval for the arrival time of the next trip. Remaining in the vehicle for no particular reason is probably allowed if you have a day ticket, but that is not what this structure (or GTFS) explicitly defines.

@derhuerst
Author

It is not one run (or block).

(Not sure what exactly you mean by "run" here, but I will assume you mean what I tried to explain.)

In the GTFS ecosystem, I have often observed the assumption that one GTFS trip corresponds to exactly one "run". Or, in plain English: that one GTFS trip means that one vehicle will continuously visit all stops in the trip, without any other trips in between and without additional stops before or after; and that after the vehicle has visited all stops in the trip, the "run" is "over".

Making that assumption would probably lead to routing errors (e.g. routes that I actually can't take or that are physically impossible) & unintuitive UIs (e.g. showing the first stop of the trip between other later stops, because another "run" in a time frame of compressed data has started).

If this assumption is not to be made, meaning the stop_times/frequencies feature of GTFS is purely a "normalisation form" to describe when & where any appropriate vehicle of a line will stop, IMO we should clarify this better in the documentation.

(Of course, none of this applies to other schemes of sending vehicles around, such as circle-based lines or lines split up by direction.)

@antrim
Contributor

antrim commented Jun 8, 2020

The GTFS Best Practices offer the following:

| Field Name | Recommendation |
| --- | --- |
| block_id | Can be provided for frequency-based trips. |

So, that means the following example is valid, and indicates a continuous loop where passengers can stay onboard at stop_A.

stop_times.txt

| trip_id | arrival_time | departure_time | stop_id | stop_sequence |
| --- | --- | --- | --- | --- |
| trip_1 | 06:10:00 | 06:10:00 | stop_A | 1 |
| trip_1 | 06:15:00 | 06:15:00 | stop_B | 2 |
| trip_1 | 06:20:00 | 06:20:00 | stop_C | 3 |
| trip_1 | 06:25:00 | 06:25:00 | stop_D | 4 |
| trip_1 | 06:30:00 | 06:30:00 | stop_E | 5 |
| trip_1 | 06:35:00 | 06:35:00 | stop_F | 6 |
| trip_1 | 06:40:00 | 06:40:00 | stop_A | 7 |

trips.txt

| route_id | trip_id | service_id | block_id |
| --- | --- | --- | --- |
| red_loop | trip_1 | weekday | red_loop_block |

frequencies.txt

| trip_id | start_time | end_time | exact_times | headway_secs |
| --- | --- | --- | --- | --- |
| trip_1 | 6:10 | 18:40 | 1 | 1800 |

Notes

  • My interpretation is that the approach above could also support interlining (Route 1 -> Route 2 -> Route 1, etc) if the trips don't overlap.
  • But if headway_secs were less than the duration of trip_1 (30 min), then this would be an error. The spec doesn't support that right now.
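
The second note can be sketched as a small check. This is only an illustration under stated assumptions: the helper names `parse_gtfs_time`, `trip_duration_secs` and `runs_overlap` are hypothetical, times are assumed to be full H:MM:SS strings, and the trip duration is taken as the span from the first departure to the last arrival.

```python
def parse_gtfs_time(text: str) -> int:
    """Convert a GTFS H:MM:SS string to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def trip_duration_secs(stop_times):
    """stop_times: list of (arrival_time, departure_time) in stop_sequence order."""
    first_departure = parse_gtfs_time(stop_times[0][1])
    last_arrival = parse_gtfs_time(stop_times[-1][0])
    return last_arrival - first_departure


def runs_overlap(stop_times, headway_secs):
    """True if the next run would start before the previous one has finished.

    Assumption: one vehicle serves the whole block, so overlapping runs
    cannot be chained into a continuous loop.
    """
    return headway_secs < trip_duration_secs(stop_times)


# First and last rows of trip_1 from the example above.
trip_1 = [("06:10:00", "06:10:00"), ("06:40:00", "06:40:00")]
print(runs_overlap(trip_1, 1800))  # False: headway equals the 30 min duration
print(runs_overlap(trip_1, 900))   # True: a 15 min headway would overlap
```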

@barbeau
Collaborator

barbeau commented Jun 8, 2020

exact_times=1 trips defined in frequencies.txt should be treated the same way as trips defined in a GTFS that doesn't include the frequencies.txt file - you just "unroll" the pattern defined in stop_times.txt into individual trips from the start to end time defined in frequencies.txt, with the start time for each individual trip being headway_secs apart. Note that in this case arrival_time and departure_time don't refer to absolute times, but rather exist to define the travel time between each stop in the trip. I agree that the documentation could be improved, including examples, to make this clearer.
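
A minimal sketch of this "unrolling", using the trip_1 loop example from earlier in the thread. The function names are hypothetical, and two points are assumptions rather than something stated here: offsets are measured from the first stop's departure_time, and end_time is treated as exclusive. Times are written in full HH:MM:SS form for simplicity.

```python
def parse_gtfs_time(text: str) -> int:
    """Convert a GTFS H:MM:SS string to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_gtfs_time(total_seconds: int) -> str:
    """Convert seconds after midnight back to HH:MM:SS."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def unroll(template, start_time, end_time, headway_secs):
    """Expand one stop_times pattern into individual runs.

    template: list of (stop_id, arrival_time, departure_time) tuples.
    Returns one list of shifted tuples per generated run.
    """
    first_departure = parse_gtfs_time(template[0][2])
    runs = []
    # Assumption: end_time is exclusive, so no run starts exactly at end_time.
    for run_start in range(parse_gtfs_time(start_time),
                           parse_gtfs_time(end_time),
                           headway_secs):
        shift = run_start - first_departure
        runs.append([
            (stop_id,
             format_gtfs_time(parse_gtfs_time(arrival) + shift),
             format_gtfs_time(parse_gtfs_time(departure) + shift))
            for stop_id, arrival, departure in template
        ])
    return runs


template = [
    ("stop_A", "06:10:00", "06:10:00"),
    ("stop_B", "06:15:00", "06:15:00"),
    ("stop_C", "06:20:00", "06:20:00"),
    ("stop_D", "06:25:00", "06:25:00"),
    ("stop_E", "06:30:00", "06:30:00"),
    ("stop_F", "06:35:00", "06:35:00"),
    ("stop_A", "06:40:00", "06:40:00"),
]
runs = unroll(template, "06:10:00", "18:40:00", 1800)
print(len(runs))      # number of individual runs generated
print(runs[1][0])     # first stop of the second run
```

Each element of `runs` is a separate vehicle journey that ends at its last stop, even though all of them share the same trip_id in the feed.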

Note there is another open proposal to better define in-seat transfers and transfer rules at #32.

@derhuerst
Author

exact_times=1 trips defined in frequencies.txt should be treated the same way as trips defined in a GTFS that doesn't include the frequencies.txt file - you just "unroll" the pattern defined in stop_times.txt into individual trips [...].

Okay, thanks for the clarification.

In this case, I advocate stating clearly in the documentation that one trip_id does not correspond to one "run" (which I tried to define above). From my subjective experience, this seems to be quite a natural assumption.

@antrim
Contributor

antrim commented Jun 8, 2020

@derhuerst : I see what you mean. Perhaps a future modification to the spec or training materials could clarify this.

GTFS was created originally with passenger-facing applications in mind, so a "trip" refers to when a vehicle operates on a route. In passenger-facing information, that usually looks like a row on a timetable.

Operational schedules have runs, which would usually consist of multiple "trips" in the passenger-centric sense of GTFS.

Some (non-standard) GTFS datasets do include information on "runs" as you're thinking of them. Discussion in issue #195. Here is an example runcut.txt file: https://openmobilitydata.org/p/ventura-county-transportation-commission/792/latest/file/runcut.txt

@barbeau added the GTFS Schedule and Support: Needs Help labels Jun 18, 2020
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions bot added the Status: Stale label Nov 17, 2021
@skinkie
Contributor

skinkie commented Nov 17, 2021

Keep open.

@github-actions bot removed the Status: Stale label Nov 18, 2021
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions bot added the Status: Stale label Nov 18, 2022
@github-actions

github-actions bot commented Dec 3, 2022

This issue has been closed due to inactivity. Issues can always be reopened after they have been closed.

@huntrob

huntrob commented Jun 12, 2024

I'm currently looking at creating a frequencies.txt file for a rail operation. Having trip_id as the main reference point really threw me for a loop, as that is not what I had expected to see there. After looking at this for a while, I determined that it uses a trip_id to pull the required data, which I believe is the route, stop sequence, and running time. I think this should be clarified, as it would save a lot of trial and error for other users.

@skinkie
Contributor

skinkie commented Jun 12, 2024

@huntrob consider this trip_id as a kind of hash over the stop sequence, the times between the stops, and the calendar. This template can then be instantiated at different times.
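
One way to picture the "hash" idea is the sketch below. It is purely illustrative: GTFS does not define such a key, `pattern_key` is a hypothetical helper, and the choice of SHA-256 over stop IDs, offsets from the first arrival, and service_id is an assumption.

```python
import hashlib


def parse_gtfs_time(text: str) -> int:
    """Convert a GTFS H:MM:SS string to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pattern_key(stop_times, service_id):
    """Derive a key from stop sequence, relative times and calendar.

    stop_times: list of (stop_id, arrival_time, departure_time) tuples.
    Absolute clock times are ignored; only offsets from the first arrival count.
    """
    origin = parse_gtfs_time(stop_times[0][1])
    parts = [service_id]
    for stop_id, arrival, departure in stop_times:
        parts.append(
            f"{stop_id}:{parse_gtfs_time(arrival) - origin}"
            f":{parse_gtfs_time(departure) - origin}"
        )
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


morning = [("S1", "06:10:00", "06:10:00"), ("S2", "06:15:00", "06:16:00")]
later = [("S1", "06:40:00", "06:40:00"), ("S2", "06:45:00", "06:46:00")]

# Two runs that differ only in their start time share one key.
print(pattern_key(morning, "weekday") == pattern_key(later, "weekday"))
```

Under this reading, the trip_id in frequencies.txt names the shared template, and each headway step instantiates it at a new start time.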


5 participants