Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New detailed_itineraries() #265

Closed
mvpsaraiva opened this issue Jun 3, 2022 · 7 comments
Closed

New detailed_itineraries() #265

mvpsaraiva opened this issue Jun 3, 2022 · 7 comments

Comments

@mvpsaraiva
Copy link
Collaborator

I have rebuilt the detailed_itineraries() function to make it faster, more reliable, and add some cool new features. It's on the dev branch now, and I think we can move it to the master branch after we update the tests and documentation.

Some of the new features:

  • New time_window parameter;
  • New suboptimal_minutes parameter, which extends the search space and returns a larger number of trips beyond the fastest ones;
  • Support for fare calculator and new max_fare parameter;
  • Routing in frequencies GFTS, including support for Monte Carlo draws (we need to test this further, as there is some randomness involved and the results may vary unpredictably)

Reproducible example:

# load libraries
library("r5r")
library("data.table")
library("mapview")


# build transport network
data_path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path)

# load origin/destination points
points <- read.csv(file.path(data_path, "poa_points_of_interest.csv"))

# load fare structure object
fare_structure_path <- system.file(
  "extdata/poa/fares/fares_poa.zip",
  package = "r5r"
)
fare_structure <- read_fare_structure(fare_structure_path)

# inputs
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
                                 format = "%d-%m-%Y %H:%M:%S")

# compute paths
dit <- detailed_itineraries(r5r_core,
                            origins = points[10,],
                            destinations = points[12,],
                            mode = c("WALK", "TRANSIT"),
                            departure_datetime = departure_datetime,
                            max_walk_dist = 1000,
                            max_trip_duration = 60,
                            fare_structure = fare_structure,
                            max_fare = 9,
                            shortest_path = F)

# visualise results
mapview(dit, zcol = "mode")

Map:
Screenshot 2022-06-03 at 11 26 28

Data:

from_id from_lat from_lon to_id to_lat to_lon option departure_time total_duration total_distance total_fare segment mode cumulative_fare segment_duration wait distance route
1 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 1 14:00:59 44.3333333333333 9460 8.37 1 WALK 0 4.56666666666667 0 174
2 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 1 14:00:59 44.3333333333333 9460 8.37 2 RAIL 4.5 6.58333333333333 9.45 4796 LINHA1
3 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 1 14:00:59 44.3333333333333 9460 8.37 3 WALK 4.5 5.73333333333333 0 256
4 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 1 14:00:59 44.3333333333333 9460 8.37 4 BUS 8.37 10.4166666666667 4.41666666666667 4083 188
5 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 1 14:00:59 44.3333333333333 9460 8.37 5 WALK 8.37 3.16666666666667 0 151
6 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 2 14:00:59 52.1166666666667 8310 7.2 1 WALK 0 7.96666666666667 0 347
7 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 2 14:00:59 52.1166666666667 8310 7.2 2 BUS 4.8 15.8666666666667 2.18333333333333 4310 703
8 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 2 14:00:59 52.1166666666667 8310 7.2 3 WALK 4.8 12.5666666666667 0 653
9 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 2 14:00:59 52.1166666666667 8310 7.2 4 BUS 7.2 7.5 2.86666666666667 2849 179
10 farrapos_station -29.997721 -51.197618 praia_de_belas_shopping_center -30.049951 -51.228752 2 14:00:59 52.1166666666667 8310 7.2 5 WALK 7.2 3.16666666666667 0 151
@mvpsaraiva
Copy link
Collaborator Author

If you need lots of path alternatives, the new detailed_itineraries() is much more flexible. By changing the suboptimal_minutes and time_window parameters you can significantly increase the number of path alternatives retrieved by r5r.

The plot below shows that you can get up to 134 different alternatives between the same origin/destination pair when suboptimal_minutes and time_window are set to 15 minutes, at the cost of longer computation times:

Screenshot 2022-06-04 at 11 36 59

Obviously, those routes are far from the best alternatives and sometimes have very large detours:

Screenshot 2022-06-04 at 11 45 05

The following table shows all the itineraries mapped above:

walk walk|187|T1|walk walk|394|T1D|walk walk|165|T7|T7|walk walk|263|walk|T7|walk walk|195|walk|T1|walk walk|2821|walk|T7|173|walk
walk|188|walk walk|165|T1D|walk walk|188|T1D|walk walk|262|walk|375|walk walk|493|walk|T7|walk walk|C1|T7|186|walk walk|268|264|walk|T1D|walk
walk|165|walk walk|346|T1|walk walk|179|T1D|walk walk|253|T7|179|walk walk|346|walk|T7|walk walk|375|walk|T7|walk walk|346|walk|T7|186|walk
walk|187|walk walk|2821|T1|walk walk|179|T1|walk walk|2821|walk|T7|walk walk|491|walk|T7|walk walk|375|walk|T1|walk walk|280|walk|T7|186|walk
walk|173|walk walk|263|T1|walk walk|394|T1|walk walk|165|T7|173|walk walk|188|T7|186|walk walk|375|walk|T1D|walk walk|491|walk|T7|186|walk
walk|179|walk walk|255|T1D|walk walk|188|T1|walk walk|253|T7|186|walk walk|262|T7|186|walk walk|274|walk|T7|T7|walk walk|429|walk|T7|186|walk
walk|186|walk walk|173|T1D|walk walk|2441|T1|walk walk|165|T7|187|walk walk|165|T7|186|walk walk|274|walk|T7|173|walk walk|493|walk|T7|186|walk
walk|264|walk walk|397|T1D|walk walk|186|T1|walk walk|187|T7|179|walk walk|210|walk|T1D|walk walk|2821|T1D|walk|187|walk walk|494|walk|T7|186|walk
walk|253|T7|walk walk|165|T1|walk walk|C1|T7|walk walk|264|walk|T7|walk walk|260|walk|T1D|walk walk|262|T1D|walk|187|walk walk|263|walk|T7|186|walk
walk|253|T1D|walk walk|C1|T1|walk walk|C1|T1D|walk walk|253|walk|375|walk walk|348|walk|T1D|walk walk|346|T1D|walk|187|walk walk|3441|walk|T7|186|walk
walk|268|264|walk walk|255|T1|walk walk|253|walk|187|walk walk|187|T7|186|walk walk|187|walk|T1D|walk walk|268|264|walk|T7|walk walk|2821|walk|T7|186|walk
walk|244|T1|walk walk|173|T1|walk walk|268|walk|T1D|walk walk|274|walk|375|walk walk|3441|walk|T1D|walk walk|253|T1D|walk|T7|walk walk|441|walk|T7|186|walk
walk|262|T1|walk walk|397|T1|walk walk|268|walk|T1|walk walk|253|T7|188|walk walk|263|walk|T1D|walk walk|R41|walk|T7|T7|walk walk|375|walk|T7|186|walk
walk|274|T1D|walk walk|187|T7|walk walk|253|T7|T7|walk walk|173|walk|187|walk walk|195|T7|186|walk walk|274|T1D|walk|T7|walk walk|253|walk|187|walk|188|walk
walk|346|T1D|walk walk|165|T7|walk walk|274|walk|T7|walk walk|187|walk|188|walk walk|195|walk|T1D|walk walk|2821|walk|T7|T7|walk
walk|2821|T1D|walk walk|188|T7|walk walk|187|T7|T7|walk walk|397|walk|375|walk walk|3441|walk|T1|walk walk|274|walk|T7|179|walk
walk|262|T1D|walk walk|262|T7|walk walk|253|T7|173|walk walk|3441|walk|T7|walk walk|210|walk|T1|walk walk|346|T1D|walk|T7|walk
walk|165|264|walk walk|280|T1|walk walk|187|T7|173|walk walk|494|walk|T7|walk walk|348|walk|T1|walk walk|274|walk|T7|186|walk
walk|253|T1|walk walk|195|T7|walk walk|188|walk|165|walk walk|429|walk|T7|walk walk|260|walk|T1|walk walk|R41|walk|T7|173|walk
walk|274|T1|walk walk|2441|T1D|walk walk|R41|walk|T7|walk walk|280|walk|T7|walk walk|441|walk|T7|walk walk|R41|walk|T7|187|walk

@mvpsaraiva
Copy link
Collaborator Author

Now you can also use detailed_itineraries() to calculate the monetary cost of trips and produce Pareto frontiers of travel time and cost between single OD pairs (for multiple pairs, check the pareto_frontier() function). This is also very useful to debug the fare calculator settings while you're configuring them. See example below:

# load packages
library("r5r")
library("dplyr")
library("ggplot2")

# build transport network
data_path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path)

# load origin/destination points
points <- read.csv(file.path(data_path, "poa_points_of_interest.csv"))

# load fare structure object
fare_structure_path <- system.file(
  "extdata/poa/fares/fares_poa.zip",
  package = "r5r"
)
fare_structure <- read_fare_structure(fare_structure_path)

# inputs
departure_datetime <- as.POSIXct("13-05-2019 14:00:00",
                                 format = "%d-%m-%Y %H:%M:%S")
# plan itineraries
dit <- detailed_itineraries(r5r_core,
                            origins = points[10,],
                            destinations = points[12,],
                            mode = c("WALK", "TRANSIT"),
                            departure_datetime = departure_datetime,
                            max_trip_duration = 140,
                            fare_structure = fare_structure,
                            max_fare = 9,
                            shortest_path = FALSE,
                            drop_geometry = TRUE)


# build Pareto frontiers
pareto_df <- dit  |>
  group_by(option) |>
  summarise(travel_time = max(total_duration),
            fare = max(total_fare),
            mode = paste(if_else(mode == "WALK", "", mode), collapse = " ")) |>
  mutate(mode = if_else(mode == "", "WALK", mode))

# plot Pareto frontiers
ggplot(pareto_df , aes(x=fare, y=travel_time)) +
  geom_step() +
  geom_point() +
  geom_text(aes(label=mode), hjust = -0.1, vjust=-0.5) +
  geom_text(aes(label=paste("BRL", scales::comma(fare, accuracy = 0.01))),
            hjust = 0.3, vjust=1.5) +
  scale_y_continuous(breaks = seq(0, 140, 20), limits = c(0, 140)) +
  scale_x_continuous(breaks = 0:10, limits = c(0, 10),
                     labels = scales::comma_format(accuracy = 0.01)) +
  theme_light() +
  labs(x = "fare (BRL)", y = "travel time (minutes)",
       subtitle = "Travel time and monetary cost of trips between\nFarrapos Station and Praia de Belas Shopping Mall")

pareto_dit

@mvpsaraiva
Copy link
Collaborator Author

Finally, if you don't care about a lot of path alternatives or monetary costs, but you need the detailed shortest itinerary between two points (with geometries, travel distances, routes and so on...), the new detailed_itineraries() also has you covered: it's much faster!

See below the processing times for up to 40,000 queries: the new version is 4 to 5 times faster, on average!

153070fd-f191-4331-8a4d-7e782c439d07

It's also much more memory efficient: it uses 5GB of RAM to compute 40,000 queries, while before it used 20GB of RAM to do the same.

I think that's a wrap for it :)

@mvpsaraiva
Copy link
Collaborator Author

A final note before closing this issue: even though detailed_itineraries() is faster now, it still runs queries on a one-to-one basis and is orders of magnitude slower than both travel_time_matrix() and expanded_travel_time_matrix(), which compute those 40,000 queries in just a few milliseconds.

@rafapereirabr
Copy link
Member

Marcus, just checking the performance comparison between CRAN version and dev version. In this simple reprex below, I get that the currenct version of the funcion on CRAN is faster then the dev version. Am I missing something here?

library(r5r)

# build transport network
data_path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path, overwrite = T)

# load origin/destination points
points <- read.csv(file.path(data_path, "poa_hexgrid.csv"))

# inputs
departure_datetime <- as.POSIXct(
 "13-05-2019 14:00:00",
 format = "%d-%m-%Y %H:%M:%S"
)

system.time(
det <- detailed_itineraries(
 r5r_core,
 origins = points,
 destinations = points, 
 mode = "TRANSIT",
 departure_datetime = departure_datetime,
 max_trip_duration = 60 , 
 shortest_path = T,
 drop_geometry = T
 )
)
  • CRAN v0.7.1: ~9 seconds
  • dev version 0.7.90000: ~25 seconds

@mvpsaraiva
Copy link
Collaborator Author

You're using the same points for origins and destinations, which is a scenario that I didn't test. Perhaps there's some optimisation to do there.

I've made a small change to your code and got some better results:

system.time(
det <- detailed_itineraries(
 r5r_core,
 origins = points[1:1227,],
 destinations = points[1227:1,], 
 mode = "TRANSIT",
 departure_datetime = departure_datetime,
 max_trip_duration = 60 , 
 shortest_path = T,
 drop_geometry = T
 )
)
  • CRAN v0.7.1: ~ 50 seconds
  • dev version: ~ 11 seconds

@rafapereirabr
Copy link
Member

ah! indeed. Here are the results that I got.

  • CRAN v0.7.1: ~ 182 seconds
  • dev version: ~ 28 seconds

thanks again, Marcus!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants