strange behavior of function example transit_network_to_sf() #192

Closed
rafapereirabr opened this issue Aug 11, 2021 · 9 comments

@rafapereirabr
Member

The example in the documentation of transit_network_to_sf() returns some stops with wrong spatial coordinates.

library(r5r)
library(sf)
library(mapview)
library(sfheaders)

# build transport network
path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path = path)

# extract transit network from r5r_core
transit_net <- transit_network_to_sf(r5r_core)

# check on the map
mapview(transit_net$stops) 

# check coordinates
a <- sfheaders::sf_to_df(transit_net$stops)
a <- a[order(a$x),]
tail(a)

>      sfg_id point_id  x  y
> 3756   3756     3756 -1 -1
> 3856   3856     3856 -1 -1
> 3864   3864     3864 -1 -1
> 3875   3875     3875 -1 -1
> 3885   3885     3885 -1 -1
> 3895   3895     3895 -1 -1
@mvpsaraiva
Collaborator

Those are the stops that R5 couldn't link to the street network, because the sample GTFS covers an area larger than the sample osm.pbf. If you check the transit_net$stops sf object, you can see that there's a linked_to_street column, which is set to FALSE in those cases, and that their lat/lon coordinates are -1.

I think I forgot to talk to you about this to find a solution. We could just drop those observations, but I haven't done it yet because it could cause problems in the future if we decide to include stop_times in transit_network_to_sf().
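As a minimal sketch of the two options mentioned above (inspecting the unlinked stops versus dropping them), the snippet below uses a small made-up data frame in place of the real stops table. The coordinate values and the flat `x`/`y` layout are assumptions for illustration; only the `linked_to_street` column name and the `-1` placeholder come from the discussion.

```r
# Hypothetical stand-in for the stops table (values are made up).
stops <- data.frame(
  point_id = 1:5,
  x = c(-51.22, -51.20, -1, -51.18, -1),
  y = c(-30.03, -30.05, -1, -30.01, -1),
  linked_to_street = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Stops that could not be snapped to the street network
unlinked <- stops[!stops$linked_to_street, ]

# Check that the unlinked stops are the ones carrying the -1 placeholder
placeholder_only <- all(unlinked$x == -1 & unlinked$y == -1)

# Option discussed above: drop the unlinked stops
linked_only <- stops[stops$linked_to_street, ]
```

Dropping rows this way changes the number of stops, which is the concern raised above if stop_times were ever joined to the output.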

@rafapereirabr
Member Author

In this case, we agreed we should keep all observations in the output.

@dhersz will impute NA spatial coordinates to 'problematic' stops, and I'll clarify this in the documentation.

@dhersz dhersz self-assigned this Aug 19, 2021
@rafapereirabr rafapereirabr self-assigned this Aug 19, 2021
@rafapereirabr
Member Author

I've also noticed that the function returns some duplicated routes. See:

library(r5r)

# build transport network
path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path = path)

# extract transit network from r5r_core
transit_net <- transit_network_to_sf(r5r_core)

r <- transit_net$routes
head( as.data.frame(r) )

> agency_id                                   agency_name route_id          long_name   short_name
> 1      EPTC Empresa Publica de Transportes e Circulação     1112  HIPICA / TRISTEZA         1112
> 2      EPTC Empresa Publica de Transportes e Circulação      149             ICARAI          149
> 3      EPTC Empresa Publica de Transportes e Circulação      149             ICARAI          149
> 4      EPTC Empresa Publica de Transportes e Circulação      165              COHAB          165
> 5      EPTC Empresa Publica de Transportes e Circulação      165              COHAB          165

@mvpsaraiva
Collaborator

Those are actually routes with the same id but different shapes:

library(r5r)

# build transport network
path <- system.file("extdata/poa", package = "r5r")
r5r_core <- setup_r5(data_path = path)

# extract transit network from r5r_core
transit_net <- transit_network_to_sf(r5r_core)

r <- subset(transit_net$routes, short_name == 149)
r$id <- c(1, 2)
mapview(r, zcol = "id")

[Image: Screenshot 2021-08-22 at 10 57 42]

But R5 drops the shape_id field, so maybe we should add another field to differentiate those shapes.

@rafapereirabr
Member Author

Oh, I see. In this case it would indeed make sense to include the shape_id column.

@rafapereirabr
Member Author

@mvpsaraiva is it possible to include the shape_id column in the route output?

@rafapereirabr
Member Author

> will impute NA spatial coordinates to 'problematic' stops, and I'll clarify this in the documentation.

I have just updated the code to impute NA spatial coordinates to 'problematic' stops.
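For readers who want to apply the same idea to output from an older version, here is a rough sketch of NA imputation on a plain data frame. It is not the actual change made in r5r; the data frame and its values are hypothetical, and in a real sf object the equivalent result would be an empty point geometry rather than NA columns.

```r
# Hypothetical stops table with the -1 placeholder on unlinked stops
stops <- data.frame(
  point_id = 1:4,
  x = c(-51.22, -1, -51.18, -1),
  y = c(-30.03, -1, -30.01, -1),
  linked_to_street = c(TRUE, FALSE, TRUE, FALSE)
)

# Replace the placeholder coordinates of unlinked stops with NA,
# keeping every row in the table
unlinked <- !stops$linked_to_street
stops[unlinked, c("x", "y")] <- NA_real_
```

All rows are kept, so the table can still be matched against other GTFS tables by stop.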

@mvpsaraiva
Collaborator

mvpsaraiva commented Aug 25, 2021

> @mvpsaraiva is it possible to include the shape_id column in the route output?

Actually no, because R5 drops the shape_id information. I think the best we can do is to create a "mock" shape_id filled with sequential numbers just to differentiate the outputs, but the user won't be able to link it to the original shape_id from the GTFS.

And I think it's easier to do this in R, via data.table.
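To make the "mock" shape_id idea concrete, here is one possible reading of it: a per-route counter that distinguishes rows sharing a `route_id`. The column name `mock_shape_id` and the sample rows are assumptions, and base R is used so the sketch runs without extra packages; the data.table form is shown only as a comment.

```r
# Hypothetical routes table mirroring the duplicated ids shown earlier
routes <- data.frame(
  route_id = c("1112", "149", "149", "165", "165"),
  short_name = c("1112", "149", "149", "165", "165"),
  stringsAsFactors = FALSE
)

# Route ids that appear more than once
repeated_ids <- unique(routes$route_id[duplicated(routes$route_id)])

# Sequential counter within each route_id (1, 2, ... per route)
routes$mock_shape_id <- ave(
  seq_len(nrow(routes)), routes$route_id, FUN = seq_along
)

# Equivalent with data.table, if it is installed:
# data.table::setDT(routes)[, mock_shape_id := seq_len(.N), by = route_id]
```

These numbers only tell shapes apart within the output; they cannot be matched back to the shape_id values in the original GTFS feed.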

@rafapereirabr
Member Author

rafapereirabr commented Aug 25, 2021

Oh, OK. In this case, I think it would be best to leave it as it is. Creating a "mock" shape_id would probably confuse users.

I've added some info in the documentation to keep a record of this.

#' @return A list with two components of a transit network in sf format:
#'         route shapes (LINESTRING) and transit stops (POINT). The same
#'         `route_id`/`short_name` might appear with different geometries. This occurs when
#'         a route has two different shape_ids. Some transit stops might be returned
#'         with geometry `POINT EMPTY` (i.e. missing `NA` spatial coordinates).
#'         This may occur when a transit stop is not snapped to the road network,
#'         possibly because the `gtfs.zip` input data covers an area larger than
#'         the `osm.pbf` input data.
