Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ab_scenario does not work with leeds_ data #34

Closed
natesheehan opened this issue Jul 9, 2021 · 9 comments
Closed

ab_scenario does not work with leeds_ data #34

natesheehan opened this issue Jul 9, 2021 · 9 comments

Comments

@natesheehan
Copy link
Collaborator

This issue is effecting cyipt/actdev#177 to rebuild abstreet json file from actdev.

It can be reporduced by the example method :

library(abstr)
ab_scenario_list = ab_scenario(
 leeds_houses,
 leeds_buildings,
 leeds_desire_lines,
 leeds_zones,
 output_format = "json_list"
)
#> Error in `[.data.frame`(od, c(names(od)[1:2], modes)): undefined columns selected
Created on 2021-07-09 by the reprex package
@Robinlovelace
Copy link
Collaborator

Is this with the latest version of od that can be installed with:

remotes::install_github("itsleeds/od")

Updating packages to try to reproduce it now...

@Robinlovelace
Copy link
Collaborator

Confirmed: I can reproduce this.

remotes::install_github("itsleeds/od")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'od' from a github remote, the SHA1 (094eae68) has not changed since last install.
#>   Use `force = TRUE` to force installation
remotes::install_github("a-b-street/abstr")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'abstr' from a github remote, the SHA1 (a3744773) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(abstr)
library(abstr)
ab_scenario_list = ab_scenario(
  leeds_houses,
  leeds_buildings,
  leeds_desire_lines,
  leeds_zones,
  output_format = "json_list"
)
#> Error in `[.data.frame`(od, c(names(od)[1:2], modes)): undefined columns selected

Created on 2021-07-09 by the reprex package (v2.0.0)

@Robinlovelace
Copy link
Collaborator

Robinlovelace commented Jul 9, 2021

Guess it's because the modes column no longer exists or was renamed. I recall discussing with this with @tnederlof...

@Robinlovelace
Copy link
Collaborator

Immediate cause of error is a tightening of the required input data based on this:

modes = c("Walk", "Bike", "Drive", "Transit"),

There are no columns with those names in the leeds_desire dataset. First thing: make the function give a better error message when those names are not present in the data.

@Robinlovelace
Copy link
Collaborator

Robinlovelace commented Jul 9, 2021

After the commit above the same example emits this message:

Column names expected: Walk, Bike, Drive, Transit
Column names in od object: osm_way_id, building
Error in ab_scenario(leeds_houses, leeds_buildings, leeds_desire_lines,  : 
  Column names do not modes, try renaming columns.

Robinlovelace added a commit that referenced this issue Jul 9, 2021
@Robinlovelace Robinlovelace changed the title ab_scenario throwing error "Error in [.data.frame(od, c(names(od)[1:2], modes)) : undefined columns selected" ab_scenario does not work with leeds_ data Jul 10, 2021
@Robinlovelace
Copy link
Collaborator

Good progress on this. At this point I don't think it's an issue with the ab_scenario() function, which did need some tweaking, but the data, which does not adhere to A/B Street mode names. Essentially there is a need for pre-processing of the input data so it better matches what A/B Street and hence abstr expects. Pathways towards are a solution are shown in the reprex below which assumes you're running the code inside the root directory of this repo.

# Exploring ab_scenario with leeds_ data

setwd("~/orgs/a-b-street/abstr")
devtools::load_all()
#> ℹ Loading abstr
#> Warning in fun(libname, pkgname): rgeos: versions of GEOS runtime 3.9.0-CAPI-1.16.2
#> and GEOS at installation 3.8.0-CAPI-1.13.1differ

args(name = ab_scenario)
#> function (od, zones, zones_d = NULL, origin_buildings = NULL, 
#>     destination_buildings = NULL, pop_var = 3, time_fun = ab_time_normal, 
#>     output = "sf", modes = c("Walk", "Bike", "Transit", "Drive"), 
#>     ...) 
#> NULL
# assign objects required by the function
od = leeds_desire_lines
zones = leeds_zones
origin_buildings = leeds_houses
destination_buildings = leeds_buildings
modes = c("Walk", "Bike", "Transit", "Drive")


if(methods::is(od, class2 = "sf")) {
  od = sf::st_drop_geometry(od)
}
if(!any(modes %in% names(od))) {
  message("Column names, at least on of: ", paste0(modes, collapse = ", "))
  message("Column names in od object: ", paste0(names(od), collapse = ", "))
  stop("Column names in od data do not match modes. Try renaming od columns")
}
#> Column names, at least on of: Walk, Bike, Transit, Drive
#> Column names in od object: geo_code1, geo_code2, all_base, walk_base, cycle_base, drive_base, length, walk_godutch, cycle_godutch, drive_godutch
#> Error in eval(expr, envir, enclos): Column names in od data do not match modes. Try renaming od columns

# we need to rename the modes columns
names(od)
#>  [1] "geo_code1"     "geo_code2"     "all_base"      "walk_base"    
#>  [5] "cycle_base"    "drive_base"    "length"        "walk_godutch" 
#>  [9] "cycle_godutch" "drive_godutch"
names(od)[4:6] = c("Walk", "Bike", "Drive")

# minimise n. columns:
modes_in_od = modes[modes %in% names(od)]
modes_in_od
#> [1] "Walk"  "Bike"  "Drive"
od = od[c(names(od)[1:2], modes_in_od)]
od_long = tidyr::pivot_longer(od, cols = modes_in_od, names_to = "mode")
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(modes_in_od)` instead of `modes_in_od` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
repeat_indices = rep(seq(nrow(od_long)), od_long$value)
od_longer = od_long[repeat_indices, 1:3]
# summary(od_longer$geo_code1 %in% zones$geo_code)
if(!is.null(origin_buildings)) {
  suppressMessages({
    origin_buildings = sf::st_centroid(origin_buildings)
  })
}
#> Warning in st_centroid.sf(origin_buildings): st_centroid assumes attributes are
#> constant over geometries of x
if(!is.null(destination_buildings)) {
  suppressMessages({
    destination_buildings = sf::st_centroid(destination_buildings)
  })
}
#> Warning in st_centroid.sf(destination_buildings): st_centroid assumes attributes
#> are constant over geometries of x

zones[origin_buildings, ]
#> Simple feature collection with 2 features and 10 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -1.539693 ymin: 53.75769 xmax: -1.491664 ymax: 53.79866
#> Geodetic CRS:  WGS 84
#> # A tibble: 2 x 11
#>   geo_code  geo_name  lad11cd   lad_name   all bicycle  foot car_driver
#>   <chr>     <chr>     <chr>     <chr>    <dbl>   <dbl> <dbl>      <dbl>
#> 1 E02002404 Leeds 075 E08000035 Leeds     4121      72  1427       1356
#> 2 E02006876 Leeds 112 E08000035 Leeds     3636      74  1351       1198
#> # … with 3 more variables: car_passenger <dbl>, motorbike <dbl>,
#> #   geometry <POLYGON [°]>
zones$geo_code = gsub(pattern = "E02002404", replacement = "lcid", x = zones$geo_code)

res = od::od_jitter(
  od = od_longer,
  z = zones,
  subpoints = origin_buildings,
  subpoints_d = destination_buildings
)
#> 0 origins with no match in zone ids
#> 0 destinations with no match in zone ids
#>  points not in od data removed.
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1

plot(res)

Created on 2021-07-10 by the reprex package (v2.0.0)

@Robinlovelace
Copy link
Collaborator

The only missing component from this is the destination zone which may need changes in ab_scenario(). Will cobble together another reprex to test it but the key thing is that other than appearing here

zones_d = NULL,

There is no zones_d in the function. A minor tweak is therefore needed to enable this object to be used. Plan to work on this now and report progress here.

@Robinlovelace
Copy link
Collaborator

Good news, it should be an easy fix as highlighted by this.

res = od::od_jitter(
  od = od_longer,
  z = zones,
  zd = zones_d, 
  subpoints = origin_buildings,
  subpoints_d = destination_buildings
)
#> 0 origins with no match in zone ids
#> 0 destinations with no match in zone ids
#>  points not in od data removed.

plot(res)

Full reprex in details.

# Exploring ab_scenario with leeds_ data

remotes::install_github("a-b-street/abstr")
#> Skipping install of 'abstr' from a github remote, the SHA1 (056a7ea1) has not changed since last install.
#>   Use `force = TRUE` to force installation

setwd("~/orgs/a-b-street/abstr")
devtools::load_all()
#> ℹ Loading abstr

args(name = ab_scenario)
#> function (od, zones, zones_d = NULL, origin_buildings = NULL, 
#>     destination_buildings = NULL, pop_var = 3, time_fun = ab_time_normal, 
#>     output = "sf", modes = c("Walk", "Bike", "Transit", "Drive"), 
#>     ...) 
#> NULL
# assign objects required by the function
od = leeds_desire_lines
zones = leeds_zones
origin_buildings = leeds_houses
destination_buildings = leeds_buildings
modes = c("Walk", "Bike", "Transit", "Drive")


if(methods::is(od, class2 = "sf")) {
  od = sf::st_drop_geometry(od)
}
if(!any(modes %in% names(od))) {
  message("Column names, at least on of: ", paste0(modes, collapse = ", "))
  message("Column names in od object: ", paste0(names(od), collapse = ", "))
  stop("Column names in od data do not match modes. Try renaming od columns")
}
#> Column names, at least on of: Walk, Bike, Transit, Drive
#> Column names in od object: geo_code1, geo_code2, all_base, walk_base, cycle_base, drive_base, length, walk_godutch, cycle_godutch, drive_godutch
#> Error in eval(expr, envir, enclos): Column names in od data do not match modes. Try renaming od columns

# we need to rename the modes columns
names(od)
#>  [1] "geo_code1"     "geo_code2"     "all_base"      "walk_base"    
#>  [5] "cycle_base"    "drive_base"    "length"        "walk_godutch" 
#>  [9] "cycle_godutch" "drive_godutch"
names(od)[4:6] = c("Walk", "Bike", "Drive")

# minimise n. columns:
modes_in_od = modes[modes %in% names(od)]
modes_in_od
#> [1] "Walk"  "Bike"  "Drive"
od = od[c(names(od)[1:2], modes_in_od)]
od_long = tidyr::pivot_longer(od, cols = modes_in_od, names_to = "mode")
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(modes_in_od)` instead of `modes_in_od` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
repeat_indices = rep(seq(nrow(od_long)), od_long$value)
od_longer = od_long[repeat_indices, 1:3]
# summary(od_longer$geo_code1 %in% zones$geo_code)
if(!is.null(origin_buildings)) {
  suppressMessages({
    origin_buildings = sf::st_centroid(origin_buildings)
  })
}
#> Warning in st_centroid.sf(origin_buildings): st_centroid assumes attributes are
#> constant over geometries of x
if(!is.null(destination_buildings)) {
  suppressMessages({
    destination_buildings = sf::st_centroid(destination_buildings)
  })
}
#> Warning in st_centroid.sf(destination_buildings): st_centroid assumes attributes
#> are constant over geometries of x

zones[origin_buildings, ]
#> Simple feature collection with 2 features and 10 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -1.539693 ymin: 53.75769 xmax: -1.491664 ymax: 53.79866
#> Geodetic CRS:  WGS 84
#> # A tibble: 2 x 11
#>   geo_code  geo_name  lad11cd   lad_name   all bicycle  foot car_driver
#>   <chr>     <chr>     <chr>     <chr>    <dbl>   <dbl> <dbl>      <dbl>
#> 1 E02002404 Leeds 075 E08000035 Leeds     4121      72  1427       1356
#> 2 E02006876 Leeds 112 E08000035 Leeds     3636      74  1351       1198
#> # … with 3 more variables: car_passenger <dbl>, motorbike <dbl>,
#> #   geometry <POLYGON [°]>
zones$geo_code = gsub(pattern = "E02002404", replacement = "lcid", x = zones$geo_code)

res = od::od_jitter(
  od = od_longer,
  z = zones,
  subpoints = origin_buildings,
  subpoints_d = destination_buildings
)
#> 0 origins with no match in zone ids
#> 0 destinations with no match in zone ids
#>  points not in od data removed.
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1

plot(res)

# using zones_d
u = "https://github.com/cyipt/actdev/raw/main/data-small/lcid/site.geojson"
zones_d = sf::read_sf(u)

res = od::od_jitter(
  od = od_longer,
  z = zones,
  zd = zones_d, 
  subpoints = origin_buildings,
  subpoints_d = destination_buildings
)
#> 0 origins with no match in zone ids
#> 0 destinations with no match in zone ids
#>  points not in od data removed.

plot(res)

Created on 2021-07-10 by the reprex package (v2.0.0)

@Robinlovelace
Copy link
Collaborator

Heads-up @natesheehan this now works as documented here: https://a-b-street.github.io/abstr/reference/ab_scenario.html

Gratuitous screenshot below.

image

We will need to change code in the ActDev codebase to support these changes. Good news: the necessary changes are minor and will look similar to the changes in the example, along the lines of:

od = leeds_desire_lines
names(od)[4:6] = c("Walk", "Bike", "Drive")
ablines = ab_scenario(
  od = od,
  zones = leeds_site_area,
  zones_d = leeds_zones,
  origin_buildings = leeds_houses,
  destination_buildings = leeds_buildings,
  output = "sf"
)

Setting the Dutch names in the same way can generate Dutch scenarios, e.g.

od = leeds_desire_lines
names(od)[8:10] = c("Walk", "Bike", "Drive")
ablines_dutch = ab_scenario(
  od = od,
  zones = leeds_site_area,
  zones_d = leeds_zones,
  origin_buildings = leeds_houses,
  destination_buildings = leeds_buildings,
  output = "sf"
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants