Plotting order: geom_sf() vs geom_point() #4340

Closed
dholstius opened this issue Feb 3, 2021 · 3 comments · Fixed by #5170
Labels
feature (a feature request or enhancement), layers 📈

Comments

@dholstius

I'm trying to figure out how to get geom_sf() to behave like geom_point() when it comes to plotting order.

I understand there's no longer an option to use aes(order = ...). Is there another way to prevent geom_sf() from re-ordering the points (rows) of its input, whilst still colouring by some attribute?

When all the points belonging to one class are plotted on top of all the others, it's hard to form an "unbiased" picture of the relative spatial distributions of the different classes. (In this reprex, they're nothing special, but with real data, the differences are meaningful.)

This seems like a bug, insofar as the behavior of geom_sf() should be consistent with that of geom_point(). But maybe I'm overlooking something? Thank you for your time and attention, and for this wonderfully useful package.

library(sf)
#> Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
#> when loading 'dplyr'
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(ggplot2)
library(tibble)

# This can be any polygon, really
nc <- sf::st_read(
  system.file("shape/nc.shp", package = "sf"),
  quiet = TRUE)

# Sample a lot of points inside the polygon
N <- 10000
sampled_points <- st_sample(
  st_transform(nc, 32119), # NC state plane, m
  size = N)

# Assign each point a random value: "Apple", "Banana", or "Cherry"
# (stored as attribute `fruit`)
labeled_points <- st_sf(
  geometry = sampled_points,
  fruit = sample(
    c("Apple", "Banana", "Cherry"),
    size = N,
    replace = TRUE))

# `geom_sf()` plots all blue points ("Cherry") on top,
# even though the rows of `labeled_points` are not sorted
# with respect to `fruit`.
ggplot(data = labeled_points) +
  geom_sf(aes(colour = fruit), size = I(3))

st_as_xy <- function (points) {
  tibble(
    as_tibble(st_coordinates(points)),
    st_drop_geometry(points))
}

# `geom_point()`, in contrast, seems to plot points in the
# order in which they appear in the data (as desired!).
ggplot(data = st_as_xy(labeled_points)) +
  geom_point(aes(X, Y, colour = fruit), size = I(3))

Created on 2021-02-03 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/Los_Angeles         
#>  date     2021-02-03                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source                            
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.0)                    
#>  backports     1.1.9   2020-08-24 [1] CRAN (R 3.6.2)                    
#>  callr         3.4.3   2020-03-28 [1] CRAN (R 3.6.2)                    
#>  class         7.3-17  2020-04-26 [1] CRAN (R 3.6.2)                    
#>  classInt      0.4-3   2020-04-07 [1] CRAN (R 3.6.2)                    
#>  cli           2.0.2   2020-02-28 [1] CRAN (R 3.6.0)                    
#>  colorspace    1.4-1   2019-03-18 [1] CRAN (R 3.6.0)                    
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.0)                    
#>  curl          4.3     2019-12-02 [1] CRAN (R 3.6.0)                    
#>  DBI           1.1.0   2019-12-15 [1] CRAN (R 3.6.0)                    
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.6.0)                    
#>  devtools      2.3.1   2020-07-21 [1] CRAN (R 3.6.2)                    
#>  digest        0.6.26  2020-10-17 [1] CRAN (R 3.6.2)                    
#>  dplyr         1.0.1   2020-07-31 [1] CRAN (R 3.6.2)                    
#>  e1071         1.7-3   2019-11-26 [1] CRAN (R 3.6.0)                    
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 3.6.2)                    
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 3.6.0)                    
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 3.6.0)                    
#>  farver        2.0.3   2020-01-16 [1] CRAN (R 3.6.0)                    
#>  fs            1.4.2   2020-06-30 [1] CRAN (R 3.6.2)                    
#>  generics      0.0.2   2018-11-29 [1] CRAN (R 3.6.0)                    
#>  ggplot2     * 3.3.2   2020-06-19 [1] CRAN (R 3.6.2)                    
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 3.6.2)                    
#>  gtable        0.3.0   2019-03-25 [1] CRAN (R 3.6.0)                    
#>  highr         0.8     2019-03-20 [1] CRAN (R 3.6.0)                    
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 3.6.2)                    
#>  httr          1.4.2   2020-07-20 [1] CRAN (R 3.6.2)                    
#>  KernSmooth    2.23-17 2020-04-26 [1] CRAN (R 3.6.2)                    
#>  knitr         1.30    2020-09-22 [1] CRAN (R 3.6.2)                    
#>  labeling      0.3     2014-08-23 [1] CRAN (R 3.6.0)                    
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 3.6.0)                    
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.6.0)                    
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.6.0)                    
#>  mime          0.9     2020-02-04 [1] CRAN (R 3.6.0)                    
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 3.6.0)                    
#>  pillar        1.4.6   2020-07-10 [1] CRAN (R 3.6.2)                    
#>  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 3.6.2)                    
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 3.6.0)                    
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 3.6.2)                    
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 3.6.1)                    
#>  processx      3.4.3   2020-07-05 [1] CRAN (R 3.6.2)                    
#>  ps            1.3.4   2020-08-11 [1] CRAN (R 3.6.2)                    
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 3.6.1)                    
#>  R6            2.4.1   2019-11-12 [1] CRAN (R 3.6.1)                    
#>  Rcpp          1.0.5   2020-07-06 [1] CRAN (R 3.6.2)                    
#>  remotes       2.2.0   2020-07-21 [1] CRAN (R 3.6.2)                    
#>  rlang         0.4.8   2020-10-08 [1] CRAN (R 3.6.2)                    
#>  rmarkdown     2.4.6   2020-10-20 [1] Github (rstudio/rmarkdown@7239cea)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.6.0)                    
#>  scales        1.1.1   2020-05-11 [1] CRAN (R 3.6.2)                    
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.0)                    
#>  sf          * 0.9-5   2020-07-14 [1] CRAN (R 3.6.2)                    
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 3.6.2)                    
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 3.6.0)                    
#>  testthat      2.3.2   2020-03-02 [1] CRAN (R 3.6.0)                    
#>  tibble      * 3.0.3   2020-07-10 [1] CRAN (R 3.6.2)                    
#>  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 3.6.2)                    
#>  units         0.6-7   2020-06-13 [1] CRAN (R 3.6.2)                    
#>  usethis       1.6.1   2020-04-29 [1] CRAN (R 3.6.2)                    
#>  vctrs         0.3.4   2020-08-29 [1] CRAN (R 3.6.2)                    
#>  withr         2.2.0   2020-04-20 [1] CRAN (R 3.6.2)                    
#>  xfun          0.18    2020-09-29 [1] CRAN (R 3.6.2)                    
#>  xml2          1.3.2   2020-04-23 [1] CRAN (R 3.6.2)                    
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 3.6.0)                    
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
@yutannihilation
Member

> This seems like a bug, insofar as the behavior of geom_sf() should be consistent with that of geom_point().

Technically, I'm not sure this should be called a bug. geom_sf() uses stat_sf(), which processes the data group by group (a common behaviour among Stats), so the output ends up ordered by group. But I agree with you: it would be good to fix this.
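
To illustrate why group-wise processing changes the row order, here is a minimal base R sketch. It is not the actual stat_sf() code; the data frame `df` and the split-then-stack step are assumptions chosen only to mimic "process each group, then combine the results".

```r
# Toy data: rows alternate between groups "a" and "b".
df <- data.frame(id = 1:8, g = rep(c("a", "b"), times = 4))

# Process group by group, then stack the pieces. This only mimics what a
# group-wise Stat does; the real stat_sf() code is more involved.
pieces  <- split(df, df$g)
stacked <- do.call(rbind, unname(pieces))

# All "a" rows now come first, then all "b" rows.
stacked$id
#> [1] 1 3 5 7 2 4 6 8
```

Whichever group is stacked last is drawn last, which is why one class ends up on top of the others.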

A simpler version of the reprex:

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(ggplot2)
library(patchwork)

pt <- st_sfc(
  st_point(c(-1,  1)),
  st_point(c( 0,  1)),
  st_point(c( 1,  1)),
  st_point(c( 1,  0)),
  st_point(c( 1, -1)),
  st_point(c( 0, -1)),
  st_point(c(-1, -1)),
  st_point(c(-1,  0))
)

sf <- st_sf(
  g = rep(c("a", "b"), times = 4),
  geometry = pt
)

non_sf <- dplyr::mutate(sf, tibble::as_tibble(st_coordinates(sf)))

p1 <- ggplot(non_sf) +
  geom_point(aes(X, Y, colour = g), size = 65) +
  guides(colour = "none") +
  coord_equal() +
  ggtitle("Expected")

p2 <- ggplot(sf) +
  geom_sf(aes(colour = g), size = 65) +
  guides(colour = "none") +
  ggtitle("Actual")

p1 * p2

Created on 2021-02-25 by the reprex package (v1.0.0)

@yutannihilation
Member

A possible fix could be to wrap compute_panel() and restore the original row order of the data afterwards. Here's my quick attempt:

yutannihilation@3281002

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
devtools::load_all("~/repo/ggplot2")
#> Loading ggplot2
library(patchwork)

pt <- st_sfc(
  st_point(c(-1,  1)),
  st_point(c( 0,  1)),
  st_point(c( 1,  1)),
  st_point(c( 1,  0)),
  st_point(c( 1, -1)),
  st_point(c( 0, -1)),
  st_point(c(-1, -1)),
  st_point(c(-1,  0))
)

sf <- st_sf(
  g = rep(c("a", "b"), times = 4),
  geometry = pt
)

non_sf <- dplyr::mutate(sf, tibble::as_tibble(st_coordinates(sf)))

p1 <- ggplot(non_sf) +
  geom_point(aes(X, Y, colour = g), size = 65) +
  guides(colour = "none") +
  coord_equal() +
  ggtitle("Expected")

p2 <- ggplot(sf) +
  geom_sf(aes(colour = g), size = 65) +
  guides(colour = "none") +
  ggtitle("Actual")

p1 * p2

Created on 2021-02-25 by the reprex package (v1.0.0)
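
The "restore the original order" idea can be sketched in base R as follows. This is not the linked commit. The helper `compute_keep_order()` and the `.row` column are hypothetical names, and the sketch assumes the per-group function returns one row per input row and carries `.row` through.

```r
df <- data.frame(id = 1:8, g = rep(c("a", "b"), times = 4))

# Hypothetical helper: tag each row with its position, process per group,
# then sort the combined result back into the input order.
compute_keep_order <- function(data, group, compute_group) {
  data$.row <- seq_len(nrow(data))                 # remember input order
  pieces <- lapply(split(data, data[[group]]), compute_group)
  out <- do.call(rbind, unname(pieces))            # now ordered by group
  out <- out[order(out$.row), , drop = FALSE]      # restore input order
  out$.row <- NULL
  rownames(out) <- NULL
  out
}

# identity() stands in for the real per-group computation.
res <- compute_keep_order(df, "g", identity)
res$id
#> [1] 1 2 3 4 5 6 7 8
```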

@thomasp85 added the feature (a feature request or enhancement) and layers 📈 labels on Mar 24, 2021
@teunbrand
Collaborator

A workaround is to explicitly set a single group, so that the stat processes all points together. The convention is to set the group to -1, but any constant value should work.

library(sf)
library(ggplot2)
library(tibble)

nc <- sf::st_read(
  system.file("shape/nc.shp", package = "sf"),
  quiet = TRUE)

N <- 10000
sampled_points <- st_sample(
  st_transform(nc, 32119), # NC state plane, m
  size = N)

labeled_points <- st_sf(
  geometry = sampled_points,
  fruit = sample(
    c("Apple", "Banana", "Cherry"),
    size = N,
    replace = TRUE))

ggplot(data = labeled_points) +
  geom_sf(aes(colour = fruit, group = -1), size = I(3))

Created on 2022-12-30 by the reprex package (v2.0.1)
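
One way to check the workaround on a small example, rather than by eye, is to inspect the built layer with layer_data(). The sketch below assumes the sf and ggplot2 packages are installed; the object names `pts`, `p` and `drawn` are made up for illustration.

```r
library(sf)
library(ggplot2)

# Eight points in a row whose class alternates a, b, a, b, ...
pts <- st_sf(
  g = rep(c("a", "b"), times = 4),
  geometry = st_sfc(lapply(1:8, function(i) st_point(c(i, 0))))
)

p <- ggplot(pts) +
  geom_sf(aes(colour = g, group = -1), size = 3)

# layer_data() returns the data as the layer will draw it. If the input
# order is preserved, the colours alternate row by row, so every run of
# identical colours has length 1.
drawn <- layer_data(p)
rle(drawn$colour)$lengths
```

If the rows had been regrouped by class, the same call would instead report two runs of four.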
