geom_sf() "missing aesthetics: geometry" with tibble but not data frame #3391

Closed
frankhecker opened this issue Jul 4, 2019 · 8 comments

@frankhecker

After using RStudio today to update all my packages, when using ggplot() and geom_sf() to plot geospatial data contained in a tibble I now get the error message "Error: stat_sf requires the following missing aesthetics: geometry". The error does not occur when using a regular data frame.

This is my first time submitting an issue for the tidyverse or R. I could not get reprex() to work, and could not understand how it is supposed to work, so I hope the following is what you are looking for. In my testing the first ggplot() succeeds, while the second fails with the error above.

library(ggplot2)
library(tibble)
library(sf)
library(maps)
us <- st_as_sf(map("state", plot = FALSE, fill = TRUE))
us_tbl <- as_tibble(us)
ggplot(data = us) + geom_sf()
ggplot(data = us_tbl) + geom_sf()

I am running RStudio Server on Ubuntu 18.04 LTS. When I load the sf library it produces the message "Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3".

@clauswilke
Member

The conversion to tibble strips the sf class that st_as_sf() had added, and ggplot2 relies on that class to find the geometry column automatically.

If you need to work with tibbles like that, you can always map the geometry column manually.

library(ggplot2)
library(tibble)
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3
library(maps)
us <- st_as_sf(map("state", plot = FALSE, fill = TRUE))
us_tbl <- as_tibble(us)

class(us)
#> [1] "sf"         "data.frame"
us
#> Simple feature collection with 49 features and 1 field
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -124.6813 ymin: 25.12993 xmax: -67.00742 ymax: 49.38323
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> First 10 features:
#>                          geometry                   ID
#> 1  MULTIPOLYGON (((-87.46201 3...              alabama
#> 2  MULTIPOLYGON (((-114.6374 3...              arizona
#> 3  MULTIPOLYGON (((-94.05103 3...             arkansas
#> 4  MULTIPOLYGON (((-120.006 42...           california
#> 5  MULTIPOLYGON (((-102.0552 4...             colorado
#> 6  MULTIPOLYGON (((-73.49902 4...          connecticut
#> 7  MULTIPOLYGON (((-75.80231 3...             delaware
#> 8  MULTIPOLYGON (((-77.13731 3... district of columbia
#> 9  MULTIPOLYGON (((-85.01548 3...              florida
#> 10 MULTIPOLYGON (((-80.89018 3...              georgia

class(us_tbl)
#> [1] "tbl_df"     "tbl"        "data.frame"
us_tbl
#> # A tibble: 49 x 2
#>                                                     geometry ID            
#>                                           <MULTIPOLYGON [°]> <chr>         
#>  1 (((-87.46201 30.38968, -87.48493 30.37249, -87.52503 30.… alabama       
#>  2 (((-114.6374 35.01918, -114.6431 35.10512, -114.603 35.1… arizona       
#>  3 (((-94.05103 33.03675, -94.05103 33.30031, -94.05676 33.… arkansas      
#>  4 (((-120.006 42.00927, -120.006 41.20139, -120.006 39.700… california    
#>  5 (((-102.0552 40.00964, -102.061 40.00391, -102.0552 39.5… colorado      
#>  6 (((-73.49902 42.04937, -73.04066 42.04364, -73.01201 42.… connecticut   
#>  7 (((-75.80231 39.72889, -75.76221 39.72889, -75.74503 39.… delaware      
#>  8 (((-77.13731 38.94394, -77.06283 38.99551, -77.01699 38.… district of c…
#>  9 (((-85.01548 30.99702, -84.99829 30.96264, -84.97537 30.… florida       
#> 10 (((-80.89018 32.0398, -80.85007 32.02834, -80.84435 32.0… georgia       
#> # … with 39 more rows

ggplot(data = us_tbl, aes(geometry = geometry)) + geom_sf()

Created on 2019-07-03 by the reprex package (v0.3.0)
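
An alternative to mapping the geometry by hand is to convert the tibble back to an sf object with st_as_sf(), which looks for a geometry list-column and restores the sf class. The following is only a minimal sketch: the `pts_tbl` tibble and its two points are made up for illustration and do not come from the thread.

```r
library(sf)
library(tibble)
library(ggplot2)

# Illustrative data (an assumption for this sketch): a plain tibble that
# holds an sfc geometry column but does not carry the sf class.
pts_tbl <- tibble(
  ID = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
class(pts_tbl)

# st_as_sf() detects the sfc column and restores the sf class.
pts_sf <- st_as_sf(pts_tbl)
class(pts_sf)

# With the sf class back, geom_sf() can find the geometry column on its own.
p <- ggplot(data = pts_sf) + geom_sf()
```

Converting back is probably the more convenient route when several layers or sf functions will use the same data; the explicit aes(geometry = geometry) mapping is enough for a single plot.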

@clauswilke
Member

Note: this may have changed from 3.1.0, because we have rewritten the code that finds geometry columns. I would argue, though, that if the data given to ggplot2 is not of type sf then ggplot2 should not try to auto-map the geometry column. One could ask whether as_tibble() should retain the sf class attribute, but that's a question for the tibble maintainers.

@yutannihilation
Member

I wonder if the error message can contain some useful hints about what to do next (maybe a handy issue for tidy dev day?).
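
As a rough illustration of what such a hint could look like, here is a sketch of a user-side check. `geometry_hint()` is a hypothetical helper written for this example; it is not part of ggplot2 or sf, and the wording of its messages is invented.

```r
library(sf)

# Hypothetical helper (not a ggplot2 or sf function): report whether `data`
# still carries the sf class and, if not, which column holds geometries.
geometry_hint <- function(data) {
  is_sfc <- vapply(data, function(col) inherits(col, "sfc"), logical(1))
  sfc_cols <- names(data)[is_sfc]
  if (inherits(data, "sf")) {
    "data is an sf object; geom_sf() can find the geometry column itself"
  } else if (length(sfc_cols) > 0) {
    sprintf(
      "data is not sf; try aes(geometry = %s) or st_as_sf(data)",
      sfc_cols[1]
    )
  } else {
    "data has no geometry column"
  }
}

# A data frame that kept its geometry column but lost the sf class.
plain <- as.data.frame(st_sf(ID = "a", geometry = st_sfc(st_point(c(0, 0)))))
geometry_hint(plain)
```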

@frankhecker
Author

Ah, thank you! In my existing code I wasn't converting to a tibble explicitly, but I was joining a tibble with a data frame produced by st_read(), which I guess converted the result to a tibble. I modified my existing code to add an explicit "aes(..., geometry = geometry)" and that fixed the problem.
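
The fix described above can be sketched as follows. This is an assumed reconstruction, not the original code: it uses the nc shapefile that ships with sf, and the `value` column is made up for illustration.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Data frame produced by st_read(); this one carries the sf class.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)[1:4, ]
nc$NAME <- as.character(nc$NAME)

# Illustrative values to join in (invented for this sketch).
d <- tibble::tibble(
  NAME = c("Ashe", "Alleghany", "Surry", "Currituck"),
  value = 1:4
)

# Tibble first: the result is a plain tibble without the sf class.
joined <- left_join(d, nc, by = "NAME")
class(joined)

# Mapping geometry explicitly, next to the other aesthetics, lets geom_sf()
# draw the plot even though `joined` is not an sf object.
p <- ggplot(joined, aes(fill = value, geometry = geometry)) + geom_sf()
```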

My apologies for filing an issue over this. I did a fair amount of Internet searching to try to track down what could be causing the error, but couldn't find any useful information. And since my code was working under the previous version of ggplot2 I presumed that this was a bug introduced in the current version.

@clauswilke
Member

@frankhecker I think it's a reasonable issue. The behavior is unexpected and confusing. It's just not obvious to me that ggplot2 is the right place to fix this.

@clauswilke
Member

I dug a little deeper and I think the overall behavior is correct and doesn't require any changes. The join functions seem to maintain the class of their first argument, so if you need to join something into an sf data frame you just have to make sure that that data frame is the first argument. Depending on your application, you may need to use right_join() instead of left_join(). And as_tibble() converts whatever you give it into a tibble, so that's appropriate as well.

library(tidyverse)

# get the first four rows of the nc shapefile
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)[1:4, ] %>%
  mutate(NAME = as.character(NAME))

d <- tibble(
  NAME = c("Ashe", "Alleghany", "Surry", "Currituck"),
  value = rnorm(4)
)

# result is sf
nc2 <- left_join(nc, d)
#> Joining, by = "NAME"
class(nc2)
#> [1] "sf"         "data.frame"
ggplot(nc2) + geom_sf()

# result is tibble
nc3 <- left_join(d, nc)
#> Joining, by = "NAME"
class(nc3)
#> [1] "tbl_df"     "tbl"        "data.frame"
# doesn't work
ggplot(nc3) + geom_sf()
#> Error: stat_sf requires the following missing aesthetics: geometry

Created on 2019-07-04 by the reprex package (v0.3.0)
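
The right_join() variant mentioned above might look like the sketch below. It assumes the same nc data with an illustrative `value` column: keeping the sf object as the first argument preserves the sf class, while right_join() keeps every row of the second table.

```r
library(sf)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)[1:4, ]
nc$NAME <- as.character(nc$NAME)

# Illustrative values (invented for this sketch).
d <- tibble::tibble(
  NAME = c("Ashe", "Alleghany", "Surry", "Currituck"),
  value = 1:4
)

# sf object first, so the result stays sf; right_join() keeps all rows of `d`,
# which is what left_join(d, nc) would have kept.
nc_right <- right_join(nc, d, by = "NAME")
class(nc_right)
```

Passing `by = "NAME"` explicitly also avoids the "Joining, by" message shown in the reprex output.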

@frankhecker
Author

I agree with your conclusions. Your point about join order is well taken, and I'll keep it in mind. Thanks for your attention to this issue, and thanks to all of you working on ggplot2!

@lock

lock bot commented Jan 1, 2020

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jan 1, 2020