Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix ARC #81

Closed
5 tasks
mdsumner opened this issue Nov 1, 2018 · 14 comments
Closed
5 tasks

fix ARC #81

mdsumner opened this issue Nov 1, 2018 · 14 comments
Milestone

Comments

@mdsumner
Copy link
Member

mdsumner commented Nov 1, 2018

ARC only works properly for polygons, or line sets that actually join as a ring.

Need multiple overlaps handle properly as well.

  • install example data
  • ensure sc_node works for SC
  • ensure sc_arc works for SC
  • (consider) reimpliment the PATH methods from SC
  • clarify the the silicate-view of this
library(stplanr)
library(sf)
x <- routes_fast_sf[as.numeric(st_length(x)) > 0, "ID"]
plot(x, lwd = nrow(x):1)  ## lots of complicated overlap

# 2975 instances
nrow(st_coordinates(x))
library(silicate)
sc <- SC(x)
# 830 unique vertices, 858 unique edges
#class       : SC
#type        : Primitive
#vertices    : 830 (2-space)
#primitives  : 858 (1-space)
#crs         : +proj=longlat +datum=WGS84 +no_defs

# 2926 instances of edges
sc$object_link_edge

library(dplyr)
sc$object_link_edge %>% group_by(edge_) %>% summarize(nlaps = n()) %>% group_by(nlaps) %>% tally()

## between 1 and 14 instances (no 9 or 13)
# A tibble: 12 x 2
#     nlaps     n
#     <int> <int>
# 1     1   299
# 2     2   156
# 3     3    47
# 4     4   150
# 5     5    25
# 6     6    78
# 7     7     8
# 8     8    41
# 9    10    23
# 10   11    15
# 11   12    11
# 12   14     5
@mpadge
Copy link
Member

mpadge commented Nov 2, 2018

With @Robinlovelace's stplanr issue and the osmdata_sc() function we've got plenty of fodder to aid resolution of this . I guess this means I should start impelementing osmdata_sc (x, sc_verb = "ARC")? (see this issue

@mdsumner
Copy link
Member Author

mdsumner commented Nov 4, 2018

The crux thing that's required in my opinion is a "path finder" from arbitrary edge sets. If we SC a data set, the nodes are then the vertices with "3 or more edge-connections", i.e.

sc_node.SC <- function(x, ...) {
  alledge <- tibble::tibble(v = c(x$edge$.vx0, x$edge$.vx1),
                            e = c(x$edge$edge_, x$edge$edge_))
  v0 <- alledge %>% group_by(v) %>% tally() %>% filter(n > 2)
  tibble::tibble(vertex_ = sort(unique(v0$v)))
}

and from that basis you then identify the paths that link between those nodes. ARC() is already doing this properly for polygon layers, but from the SC model we really need those paths to be reconstructed from raw edges. All the rest is joins and so forth.

@mpadge
Copy link
Member

mpadge commented Nov 4, 2018

That's the dodgr graph contraction algorithm, the complication there being directed edges, which mean n >2 only for one-way edges, but either n>3 or n>4 for other cases. It's more complicated than I had thought before we started implementing it.

Edit: oh, and 2 only works like that when directions are opposing. So the bigger discussion I'd like to have is about compatability of directed graphs with SC in general

@mdsumner
Copy link
Member Author

mdsumner commented Nov 4, 2018

Ah indeed indeed, !n == 2.

I feel that directed stuff can come out later - I get mighty confused about it, but I think the way I do it in SC now is helpful. You have to normalize all edges (remove duplicated instances) and you do that by parallel sort, but also record each instance of the edge and what its orientation was - either oriented as the edge is now (.vx0, .vx1) native_ = TRUE or not (.vx1, .vx0) native_ = FALSE. So in a polygon layer there's either 1 or 2 instances, and if two you have one native_ and one not native_. In a complexly layered data set (like routes_fast_sf) there are just more instances - and some are native orientation, some aren't - or they are explicitly orientated by values stored on them, but there's a "royal" instance on the edge table, and maybe there should be a rule about what the royal orientation is (but then the geometry can change arbitrarily, and orientation can be re-inferred anyway so not sure that's important).

Then directed edges guide the question of which instances you keep, and what values you store with them - but the graph itself is agnostic (or arbitrary) to any particular orientation.

(SF can't do this at all, and geometry algorithms return in some y then x, or plane-sweep order to the frustration of many who want a line to run a particular way)

@mdsumner
Copy link
Member Author

mdsumner commented Nov 4, 2018

"edge_id" in dodgr_contract_graph identifies instances, yeah?

@mdsumner
Copy link
Member Author

mdsumner commented Nov 4, 2018

Here's a rough stab, I understand dodgr better now!

library(dodgr)

to_dodgr <- function(x, ...) {
  UseMethod("to_dodgr")
}
to_dodgr.SC <- function(x, ...) {
  x$object_link_edge %>% dplyr::inner_join(x$edge) %>% 
    dplyr::transmute(from_id = ifelse(native_, .vx0, .vx1), 
                     to_id = ifelse(native_, .vx1, .vx0), d = 1, 
                     edge_id = row_number(), edge_orig = edge_)
}
sc <- SC(routes)

instances <- to_dodgr(sc)
cg <- dodgr_contract_graph(instances)


als <- cg$edge_map  %>%  split(.$edge_new) %>% purrr::map(~ 
                          dplyr::inner_join(.x, instances %>% mutate(edge_id = as.character(edge_id)), c("edge_old" = "edge_id")) 
                                                  )
library(dplyr)
expand_verts <- function(a) {
  a %>% inner_join(sc$vertex, c("from_id" = "vertex_")) %>% dplyr::rename(x0 = x_, y0 = y_) %>% 
  inner_join(sc$vertex, c("to_id" = "vertex_"))
}
plot(sc)
for (i in seq_along(als)) {
  tab <- expand_verts(als[[i]])  
  segments(tab$x0, tab$y0, tab$x_, tab$y_, col = sample(rainbow(length(als)), 1), lwd = 4)
}

points(sc_node(sc) %>% inner_join(sc$vertex) %>% select(x_, y_))

It's not getting every arc, and other scary issues:

  • I might be in trouble using integer IDs, where dodgr is converting to character (see my casts above)
  • don't give dodgr_contract_graph the edge table, it crashes (i.e. from_id = .vx0, to_id = .vx1 - that probably needs catching)
  • I lost sight of the fact that my edge_id is an instance id, so there's a join back to the unique edge, it works it's just confusing with all the names
  • I haven't checked anything except via this plot

image

@mpadge
Copy link
Member

mpadge commented Nov 4, 2018

Oh that's all grand, but just one clarification, if I may... You mean (.vx0,.vx1, _native=T) == (.vx1,.vx0, _native=F) and (.vx0,.vx1, _native=F) == (.vx1,.vx0, _native=T), but not the other way round? That would indeed allow immediate expression of dodgr-type structures in these terms. Coolio!

Edit: can we use _dir instead of _native? I've just got a general aversion to the latter term. Or do you envision it being more than just _dir? Do or will other tables have comparable columns? And if so, what might be some use cases for such?

@mdsumner
Copy link
Member Author

mdsumner commented Nov 4, 2018

I don't totally get your "You mean ..." sentence.

When SC0 is run, its topology_ column has .vx0 and .vx which are the native order of edge pairs from the input data (sf polygons or lines). The vertices are unique, but these edge pairs obviously are not - every object has its own copy of any particular vertex-pair index.

When SC is run, those pairs are also made unique - by using pmin/pmax https://github.com/hypertidy/silicate/blob/master/R/SC-model.R#L50-L51 - and so we can remove repeated instances (they are all "undirected" in one sense). But the object_link_edge table records those original instances and whether the native index in edge matches the orientation they came in with.

To me it's a way of recording the way the data came up, so it's reconstructable up to this point. As we filter and modify edges obviously we drift away from the original data. I'm not sure yet how traditional directed/undirected graphs should be seen in this way. Does that make sense?

@mdsumner
Copy link
Member Author

mdsumner commented Nov 5, 2018

Bingo!

library(dodgr)

to_dodgr <- function(x, ...) {
  UseMethod("to_dodgr")
}
# to_dodgr.SC <- function(x, ...) {
#   x$object_link_edge %>% dplyr::inner_join(x$edge) %>% 
#     dplyr::transmute(from_id = ifelse(native_, .vx0, .vx1), 
#                      to_id = ifelse(native_, .vx1, .vx0), d = 1, 
#                      edge_id = row_number(), edge_orig = edge_)
# }


# to_dodgr.SC <- function(x, ...) {
#   dplyr::bind_rows(x$object_link_edge %>% dplyr::inner_join(x$edge) %>% 
#     dplyr::transmute(from_id = .vx0,  to_id = .vx1), 
#     x$object_link_edge %>% dplyr::inner_join(x$edge) %>% 
#       dplyr::transmute(from_id = .vx1, 
#                        to_id = .vx0)
#   ) %>%  dplyr::mutate(d = 1, edge_id = row_number())
# }
to_dodgr.SC <- function(x, ...) {
  edge <- x$edge %>% transmute(from_id = .vx0, to_id = .vx1, edge_)
  edge2 <- tibble(from_id = edge$to_id, to_id  = edge$from_id, edge_ = edge$edge_)
  bind_rows(edge, edge2) %>% mutate(edge_id = row_number(), d = 1)
}

sc <- SC(routes)

instances <- to_dodgr(sc)
cg <- dodgr_contract_graph(instances)


als <- cg$edge_map  %>%  split(.$edge_new) %>% purrr::map(~ 
                                                            dplyr::inner_join(.x, instances %>% mutate(edge_id = as.character(edge_id)), c("edge_old" = "edge_id")) 
)
library(dplyr)
expand_verts <- function(a) {
  a %>% inner_join(sc$vertex, c("from_id" = "vertex_")) %>% dplyr::rename(x0 = x_, y0 = y_) %>% 
    inner_join(sc$vertex, c("to_id" = "vertex_"))
}
#plot(sc$vertex[1:2])
plot(sc)
cols <- viridis::viridis(length(als))
for (i in seq_along(als)) {
  tab <- expand_verts(als[[i]])  
  segments(tab$x0, tab$y0, tab$x_, tab$y_, col = sample(cols, 1), lwd = 4)
}


points(sc_node(sc) %>% inner_join(sc$vertex) %>% select(x_, y_))

It's a mess, but I have to leave this for now - just had to try that! (I put a 'routes' object in silicate, a filtered copy of routes_fast_sf - no 0-length lines).

image

@Robinlovelace
Copy link
Contributor

I think it's fair to say the routes dataset has been shredded in ways its creators never conceived of. Great work by the looks of it. And heads-up @richardellison. Guess you'll be interested in this.

@mdsumner
Copy link
Member Author

mdsumner commented Dec 11, 2018

This is a simpler way to identify shared vertices from terminal vertices. (Current ARC is failing to find terminal vertices, as it was premised around polygons. )

library(sf)
sm_routes <- sf::st_as_sf(rmapshaper::ms_simplify(as(routes, "Spatial")))
sfx <- sm_routes[c(25, 27), ]
sfx <- minimal_mesh
sc <- SC0(sfx)
#sc <- SC0(silicate:::minimal_line)
## on creation, SC0 has all edge instances in the order they occur
sc$object$object_ <- seq_len(nrow(sc$object))
edge <- tidyr::unnest(sc$object[c("object_", "topology_")])


## nodes that occur twice or more
shared_verts <- edge$.vx0[which(!match(edge$.vx0, edge$.vx1) == (seq_len(nrow(edge)) - 1))]
## nodes that occur only once (for lines)
terminal_verts <- tibble::tibble(vertex_ = as.vector(t(as.matrix(edge[c(".vx0", ".vx1")])))) %>% 
  dplyr::count(.data$vertex_) %>% dplyr::filter(n == 1) %>% dplyr::pull(.data$vertex_)

plot(sfx, lwd = c(10, 2), reset = FALSE)
if (length(shared_verts) > 0) points(sc$vertex[shared_verts, c("x_", "y_")], col = "red", cex = 1.5)
if (length(terminal_verts) > 0) points(sc$vertex[terminal_verts, c("x_", "y_")], col = "green", cex = 1, pch = 19)

It works for various line overlaps and for polygons. We need to be able to also identify when a shared vertex is (only?) shared by an enclosing path, then that node-classification is enough to rebuild a sensible ARC/ARC0.

@mdsumner
Copy link
Member Author

ARC print is wrong

x <- structure(list(TRANSEG_ID = c(6084063L, 4296322L, 4299053L, 6169126L, 
6084048L, 6168802L, 6169130L), TRANS_TYPE = c("Road", "Road", 
"Road", "Road", "Road", "Road", "Road"), TSEG_FEAT = c("Normal Transport Segment", 
"Intersection Segment", "Normal Transport Segment", "Normal Transport Segment", 
"Normal Transport Segment", "Intersection Segment", "Dual Carriageway"
), SHAPE = structure(list(structure(list(structure(c(525974.353, 
525941.38, 5251023.617, 5251011.371), .Dim = c(2L, 2L))), class = c("XY", 
"MULTILINESTRING", "sfg")), structure(list(structure(c(525974.353, 
525978.84, 525984.763, 525991.963, 525997.328, 525998.134, 5251023.617, 
5251023.215, 5251020.462, 5251013.262, 5251005.962, 5251004.866
), .Dim = c(6L, 2L))), class = c("XY", "MULTILINESTRING", "sfg"
)), structure(list(structure(c(526019.274, 525974.353, 5251050.548, 
5251023.617), .Dim = c(2L, 2L))), class = c("XY", "MULTILINESTRING", 
"sfg")), structure(list(structure(c(525974.353, 525969.462, 525963.162, 
525960.884, 5251023.617, 5251033.762, 5251044.663, 5251048.042
), .Dim = c(4L, 2L))), class = c("XY", "MULTILINESTRING", "sfg"
)), structure(list(structure(c(525960.884, 525957.162, 525929.042, 
5251048.042, 5251053.563, 5251090.63), .Dim = 3:2)), class = c("XY", 
"MULTILINESTRING", "sfg")), structure(list(structure(c(525941.38, 
525949.487, 525956.543, 525959.671, 525961.378, 525961.803, 525961.089, 
525960.884, 5251011.371, 5251018.537, 5251024.258, 5251028.506, 
5251032.895, 5251037.627, 5251045.204, 5251048.042), .Dim = c(8L, 
2L))), class = c("XY", "MULTILINESTRING", "sfg")), structure(list(
    structure(c(525984.056, 525982.162, 525978.962, 525974.862, 
    525974.353, 5250981.35, 5250996.662, 5251010.562, 5251022.562, 
    5251023.617), .Dim = c(5L, 2L))), class = c("XY", "MULTILINESTRING", 
"sfg"))), class = c("sfc_MULTILINESTRING", "sfc"), precision = 0, bbox = structure(c(xmin = 525929.042, 
ymin = 5250981.35, xmax = 526019.274, ymax = 5251090.63), class = "bbox"), crs = structure(list(
    epsg = 28355L, proj4string = "+proj=utm +zone=55 +south +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs"), class = "crs"), n_empty = 0L)), row.names = c(NA, 
-7L), sf_column = "SHAPE", agr = structure(c(TRANSEG_ID = NA_integer_, 
TRANS_TYPE = NA_integer_, TSEG_FEAT = NA_integer_), .Label = c("constant", 
"aggregate", "identity"), class = "factor"), class = c("sf", 
"tbl_df", "tbl", "data.frame"))

This should be 7 paths, 30 coordinates.

ARC(x)
class       : ARC
type        : Sequential
vertices    : 23 (2-space)
paths       : 4 (1-space)
coordinates : 22

@mdsumner
Copy link
Member Author

mdsumner commented Sep 7, 2019

Dump ARC except for the working polygon case

@mdsumner mdsumner mentioned this issue Sep 7, 2019
6 tasks
@mdsumner
Copy link
Member Author

mdsumner commented Sep 9, 2019

done, this needs a revisit one fine day

@mdsumner mdsumner closed this as completed Sep 9, 2019
@mdsumner mdsumner moved this from To do to Done in CRAN release 0.2.0 Oct 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

No branches or pull requests

3 participants