more output from travel_time_matrix() #194

Closed
rafapereirabr opened this issue Aug 24, 2021 · 15 comments

@rafapereirabr
Member

rafapereirabr commented Aug 24, 2021

@mvpsaraiva , does R5 allow us to get more information from travel_time_matrix(), such as total distance? Or in the case of walking + transit, the walking distance?

@mvpsaraiva
Collaborator

mvpsaraiva commented Aug 25, 2021

Yes, we can have lots of extra path information since R5 6.2.

Taking an example from here:

origin  destination  routes  boardStops  alightStops  rideTimes  accessTime  egressTime  transferTime  waitTimes  totalTime  nIterations
1       2            70|553  1452|86944  86944|88339  16.0|8.0   3.1         13.6        0             1.9|10.0   52.6       3
1       2            70      1452        8846         30         3.1         2.6         0             1.9        37.6       57
1       3            70|61   1452|86944  88333|7783   16.0|17.0  3.1         17          0.6           1.9|8.4    64         14
1       3            70|61   1452|86944  86944|7783   16.0|17.0  3.1         17          9.6           1.9|8.4    72.9       46

The | character separates per-leg values within the same column. So, in the first row, the rider boarded route 70 at stop 1452 and got off at stop 86944, then boarded route 553 at stop 86944 and rode to stop 88339. We also get access, egress, and transfer times for the trip, plus wait and ride times for each leg.
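As a rough illustration of that layout, the first sample row can be split into its legs as follows. This is only a sketch in Python, not r5r or R5 code: the `Leg` class and `parse_legs` helper are hypothetical names, and the column order is assumed to be exactly the one shown in the table above.

```python
from dataclasses import dataclass

# Column order copied from the sample table in this thread (an assumption
# about the layout, not an official R5 schema).
COLUMNS = ["origin", "destination", "routes", "boardStops", "alightStops",
           "rideTimes", "accessTime", "egressTime", "transferTime",
           "waitTimes", "totalTime", "nIterations"]

# Columns that hold one value per transit leg, separated by "|".
PER_LEG = ("routes", "boardStops", "alightStops", "rideTimes", "waitTimes")


@dataclass
class Leg:
    route: str
    board_stop: str
    alight_stop: str
    ride_time: float  # minutes
    wait_time: float  # minutes


def parse_legs(row: str) -> list[Leg]:
    """Split one whitespace-separated row into its transit legs."""
    record = dict(zip(COLUMNS, row.split()))
    per_leg_values = [record[name].split("|") for name in PER_LEG]
    return [Leg(route, board, alight, float(ride), float(wait))
            for route, board, alight, ride, wait in zip(*per_leg_values)]


legs = parse_legs(
    "1 2 70|553 1452|86944 86944|88339 16.0|8.0 3.1 13.6 0 1.9|10.0 52.6 3"
)
for leg in legs:
    print(leg)
```

Each `Leg` pairs the values that share a position across the pipe-separated columns, so the second leg here is route 553 from stop 86944 to stop 88339.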

The question is: how much of this information do we want to pass along to r5r, and how?

@rafapereirabr
Member Author

This is brilliant. Now to the question "how much of this information we want to pass along to r5r, and how." Here are my two cents:

  1. I don't think it would be necessary to output info on routes, boardStops and alightStops. The user could get this from the detailed_itineraries() function.
  2. I think it would be great to keep the columns totalTime, egressTime, transferTime, and waitTimes. This would make the output much richer at a very low computational cost. I'm just not sure how these columns could be presented when a user sets more than one percentile, for example percentiles = c(20, 50, 80).

ps. What does the nIterations column mean?

@mvpsaraiva
Collaborator

mvpsaraiva commented Aug 25, 2021

Great! I agree that breaking down travel times into segments is quite useful, and requires much less work than breaking down the whole itinerary. Even more so considering that stop_ids alone are not very useful, and we probably would have to fetch their lat lon coordinates as well.

About the percentiles, those travel time segments do not interact directly with them. Basically, we can choose to get the average of those statistics in the time window, or the minimum (the fastest trip in the time window).

Regarding the nIterations column, from the documentation:

nIterations: number of departure minutes in the departure time window at which this path is optimal.

The optimal path between any OD pair varies during the time window due to transit schedules, so nIterations indicates how many times that particular itinerary resulted in the shortest travel time.
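To make the "minimum versus average" choice concrete, here is a small Python sketch using the two paths found between origin 1 and destination 2 in the sample table. It assumes that the average is weighted by nIterations (the number of departure minutes at which each path is optimal); that weighting is an assumption made for illustration, not a statement of how R5 or r5r computes it. The helper names are hypothetical.

```python
# The two paths between origin 1 and destination 2 from the sample table.
# Times are in minutes; n_iterations is the nIterations column.
paths = [
    {"access": 3.1, "egress": 13.6, "transfer": 0.0, "total": 52.6, "n_iterations": 3},
    {"access": 3.1, "egress": 2.6, "transfer": 0.0, "total": 37.6, "n_iterations": 57},
]


def fastest(candidates):
    """'min' option: the path with the shortest total travel time."""
    return min(candidates, key=lambda p: p["total"])


def weighted_mean(candidates, field):
    """'mean' option (assumed): average of a field, weighted by n_iterations."""
    total_weight = sum(p["n_iterations"] for p in candidates)
    return sum(p[field] * p["n_iterations"] for p in candidates) / total_weight


best = fastest(paths)
mean_total = weighted_mean(paths, "total")
mean_egress = weighted_mean(paths, "egress")

print(f"fastest total: {best['total']:.1f} min")
print(f"weighted mean total: {mean_total:.2f} min")
print(f"weighted mean egress: {mean_egress:.2f} min")
```

Because the 37.6-minute path is optimal for 57 of the 60 departure minutes, the weighted averages sit close to that path's values.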

We have a caveat, though: we can only get detailed path information for up to 5000 destination points. This limit is hardcoded in R5 in this line. Perhaps that's something we can discuss with Conveyal as well.

@cseveren

For what it's worth, an expanded travel_time_matrix that reports access and egress times would fit exactly a use case that has pushed me to use detailed_itineraries; I don't need all the detail that the latter command returns, but want to know (e.g.,) how much walking is required for a given fastest trip.

@rafapereirabr
Member Author

I'm glad to hear you'll find this useful as well @cseveren .

Thanks for the clarification, @mvpsaraiva. Regarding this hardcoded limit upstream in R5, it should preferably be overridden on the Java side or in r5r, but I'm not sure that's possible. Is it?
If that's not possible, we should check with @ansoncfit and the Conveyal team whether this hardcoded limit could be changed upstream in R5.

@mvpsaraiva
Collaborator

It would be quite easy to change that limit in our own R5 jar, but I've just created an issue in Conveyal's repository to suggest a change upstream.

@rafapereirabr
Member Author

rafapereirabr commented Oct 21, 2021

Documentation suggestion:

#' @param breakdown logical. If `FALSE` (default), the function returns a simple
#'                  output with columns origin, destination and travel time
#'                  percentiles. If `TRUE`, r5r breaks down the trip information
#'                  and returns more columns with estimates of `access_time`,
#'                  `waiting_time`, `ride_time`, `transfer_time`, `total_time`,
#'                  `n_rides` and `route`. Warning: Setting `TRUE` makes the
#'                  function significantly slower.
#'
#' @param breakdown_stat string. If `min`, all the broken-down trip information
#'        is based on the trip itinerary with the smallest waiting time in the
#'        time window. If `breakdown_stat = mean`, the information is based on
#'        the trip itinerary whose waiting time is the closest to the average
#'        waiting time in the time window.

@rafapereirabr
Member Author

@mvpsaraiva, I think we are ready to merge dev into the master branch, right?

@mvpsaraiva
Collaborator

Agreed!

@CWen001

CWen001 commented Feb 19, 2022

Hello, @rafapereirabr, @mvpsaraiva. Is it still planned to add distance information (total distance; walking distance to/from transit)?

For now, I'm wondering if there are best practices for calculating distances from the travel time output. I saw in the documentation that the default average walking speed is 3.6 km/h and the cycling speed is 12 km/h. But for bus and other transit modes, converting time to distance via speed might not be a straightforward way to estimate the total distance.

Thank you very much for this powerful package.

@rafapereirabr
Member Author

Hi @CWen001 . I'm not entirely sure it's possible to extract trip distance information from R5, but @mvpsaraiva will be able to confirm that.

In any case, it can be tricky to get distance info for public transport trips. This is because trip distance info depends on the shapes.txt file in the GTFS input, and many GTFS feeds do not have that file.

@mvpsaraiva
Collaborator

Hi @CWen001. The only way you can get information on travel distances is with detailed_itineraries(). The outputs that R5 provides for travel_time_matrix() don't include that information, so we can't pass it along to r5r users. Conveyal would need to update R5 to compute that information, but I'm quite sure this is not in their plans (for many technical and practical reasons).

Calculating walking and cycling distances from times is relatively straightforward, but it's not 100% accurate. You wouldn't be considering topography, for example, or turn restrictions. I also believe R5 may add a small penalty to walking times when pedestrians need to cross busy/large streets.
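A back-of-the-envelope version of that conversion, sketched in Python rather than taken from r5r: it multiplies travel time by the default speeds quoted earlier in this thread (3.6 km/h walking, 12 km/h cycling) and, as noted above, ignores topography, turn restrictions, and any crossing penalties. The function name and mode labels are hypothetical.

```python
# Default speeds quoted in this thread, in km/h. If different speeds were
# passed to the routing call, the same values should be used here.
DEFAULT_SPEED_KMH = {"walk": 3.6, "bicycle": 12.0}


def approx_distance_km(minutes: float, mode: str = "walk") -> float:
    """Rough distance estimate: constant speed times travel time."""
    return DEFAULT_SPEED_KMH[mode] * minutes / 60.0


# Access and egress times (minutes) from the first row of the sample table.
access_km = approx_distance_km(3.1)
egress_km = approx_distance_km(13.6)

print(f"access walk: about {access_km:.2f} km")
print(f"egress walk: about {egress_km:.2f} km")
```

The result is only as good as the constant-speed assumption, so it is better read as an approximation than as a measured network distance.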

Estimating travel distances by public transport is even more complicated and, as @rafapereirabr said, impossible in many situations.

@CWen001

CWen001 commented Feb 20, 2022

Thank you very much for your replies, @rafapereirabr, @mvpsaraiva. Now I understand the complexities under the hood.

One use case for our urban planning department is to calculate a large travel time matrix. But somehow the travel times from r5r (with default settings) are about half of what we got from Google services for a test sample of the same origins and destinations. As local people are very familiar with, and sensitive to, travel times in the city, they might question the results. Since distance is unavailable, I think the way to go is to carefully study and test each parameter of travel_time_matrix() to tune the time results.

Still, a lot of respect for the wonderful open-source package and the developers.

@rafapereirabr
Member Author

@CWen001, are you using the exact same GTFS feed in r5r and in Google services? I'm curious about the root cause of this large difference, but we need to make sure we're comparing the same GTFS input.

@mvpsaraiva
Collaborator

I agree with @rafapereirabr that such a large difference (half the time!) is most likely due to differences between the GTFS feeds.

One experiment you can try is to set the parameters time_window = 60 and percentiles = c(5, 25, 50, 75, 95). Then you can see if any of those travel time percentiles are close to Google's estimates.
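Once those percentiles are available for an OD pair, the comparison itself is simple. The Python sketch below finds the percentile whose travel time is closest to an external estimate. All numbers are made up for illustration, and `closest_percentile` is a hypothetical helper, not part of r5r.

```python
def closest_percentile(times_by_percentile: dict[int, float], reference: float):
    """Return the (percentile, minutes) pair nearest to the reference time."""
    return min(times_by_percentile.items(),
               key=lambda item: abs(item[1] - reference))


# Hypothetical travel times (minutes) for one OD pair at each percentile.
r5r_minutes = {5: 24.0, 25: 29.0, 50: 35.0, 75: 43.0, 95: 52.0}

# Hypothetical estimate for the same OD pair from another service.
google_minutes = 50.0

pct, minutes = closest_percentile(r5r_minutes, google_minutes)
print(f"closest match: p{pct} at {minutes:.0f} min")
```

If the external estimate consistently lines up with a high percentile, that would suggest the other service reports something closer to a worst-case departure than a median one; if no percentile comes close, the input feeds are a more likely cause.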
