Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

st_write silently drops boolean columns over GDAL DB connection #722

Closed
dpprdan opened this issue Apr 24, 2018 · 8 comments
Closed

st_write silently drops boolean columns over GDAL DB connection #722

dpprdan opened this issue Apr 24, 2018 · 8 comments

Comments

@dpprdan
Copy link
Contributor

dpprdan commented Apr 24, 2018

As mentioned in #592, st_write silently drops boolean columns on a GDAL DB connection:

suppressPackageStartupMessages(library(sf))

nc_bool <- 
  st_read(
    system.file("shape/nc.shp", package="sf"), 
    stringsAsFactors = FALSE, 
    quiet = TRUE
  )[, c("AREA", "geometry")]

nc_bool$large_area <- nc_bool$AREA >= 0.15


st_write(nc_bool, "PG:dbname=postgis", "nc_bool", delete_layer = TRUE)
#> Deleting layer `nc_bool' using driver `PostgreSQL'
#> Updating layer `nc_bool' to data source `PG:dbname=postgis' using driver `PostgreSQL'
#> features:       100
#> fields:         1
#> geometry type:  Multi Polygon
st_read("PG:dbname=postgis", "nc_bool", quiet = TRUE) %>% colnames()
#> [1] "area"         "wkb_geometry"

There should be a warning when the driver drops a column. IMO there should also be a warning in the help file re: GDAL and boolean columns (#579) and the use of the GDAL driver possibly even discouraged for DB connections when there are good ‘native’ R DB drivers available (i.e. at least for PostgreSQL)

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language en                          
#>  collate  German_Germany.1252         
#>  tz       Europe/Berlin               
#>  date     2018-04-24
#> Packages -----------------------------------------------------------------
#>  package     * version date       source                               
#>  backports     1.1.2   2017-12-13 CRAN (R 3.4.3)                       
#>  base        * 3.4.4   2018-03-15 local                                
#>  class         7.3-14  2015-08-30 CRAN (R 3.4.4)                       
#>  classInt      0.2-3   2018-04-16 CRAN (R 3.4.4)                       
#>  compiler      3.4.4   2018-03-15 local                                
#>  datasets    * 3.4.4   2018-03-15 local                                
#>  DBI           0.8     2018-03-02 CRAN (R 3.4.3)                       
#>  devtools      1.13.5  2018-02-18 CRAN (R 3.4.3)                       
#>  digest        0.6.15  2018-01-28 CRAN (R 3.4.3)                       
#>  e1071         1.6-8   2017-02-02 CRAN (R 3.4.2)                       
#>  evaluate      0.10.1  2017-06-24 CRAN (R 3.4.0)                       
#>  graphics    * 3.4.4   2018-03-15 local                                
#>  grDevices   * 3.4.4   2018-03-15 local                                
#>  grid          3.4.4   2018-03-15 local                                
#>  htmltools     0.3.6   2017-04-28 CRAN (R 3.4.0)                       
#>  knitr         1.20    2018-02-20 CRAN (R 3.4.3)                       
#>  magrittr      1.5     2014-11-22 CRAN (R 3.4.2)                       
#>  memoise       1.1.0   2018-01-08 Github (hadley/memoise@611cfad)      
#>  methods     * 3.4.4   2018-03-15 local                                
#>  Rcpp          0.12.16 2018-03-13 CRAN (R 3.4.4)                       
#>  rmarkdown     1.9.8   2018-04-05 Github (rstudio/rmarkdown@8cfc878)   
#>  RPostgreSQL   0.6-3   2018-01-24 Github (tomoakin/RPostgreSQL@f93cb17)
#>  rprojroot     1.3-2   2018-01-03 CRAN (R 3.4.3)                       
#>  sf          * 0.6-2   2018-04-23 Github (r-spatial/sf@5a708f0)        
#>  spData        0.2.8.3 2018-03-25 CRAN (R 3.4.4)                       
#>  stats       * 3.4.4   2018-03-15 local                                
#>  stringi       1.1.7   2018-03-12 CRAN (R 3.4.4)                       
#>  stringr       1.3.0   2018-02-19 CRAN (R 3.4.3)                       
#>  tools         3.4.4   2018-03-15 local                                
#>  udunits2      0.13    2016-11-17 CRAN (R 3.4.1)                       
#>  units         0.5-1   2018-01-08 CRAN (R 3.4.3)                       
#>  utils       * 3.4.4   2018-03-15 local                                
#>  withr         2.1.2   2018-03-15 CRAN (R 3.4.4)                       
#>  yaml          2.1.18  2018-03-08 CRAN (R 3.4.4)
@edzer
Copy link
Member

edzer commented Apr 24, 2018

Thanks! This tries to use the FieldDefn->GetSubType() logic of GDAL; I see logicals round tripped coming back as integers, but I can't see whether in PostGIS they are int or booleans (they print as 0/1: does this indicate they are not booleans?) I don't even know whether the gdal/postgis drivers supports all this, couldn't find any mentioning in the driver docs.

@edzer edzer closed this as completed in 3460f05 Apr 24, 2018
@edzer
Copy link
Member

edzer commented Apr 24, 2018

Sorry my bad; this seems to work now:

suppressPackageStartupMessages(library(sf))

nc_bool <- 
  st_read(
    system.file("shape/nc.shp", package="sf"), 
    stringsAsFactors = FALSE, 
    quiet = TRUE
  )[, c("AREA", "geometry")]

nc_bool$large_area <- nc_bool$AREA >= 0.15
st_write(nc_bool, "PG:dbname=postgis", "nc_bool", delete_layer = TRUE)
# Deleting layer `nc_bool' using driver `PostgreSQL'
# Updating layer `nc_bool' to data source `PG:dbname=postgis' using driver `PostgreSQL'
# features:       100
# fields:         2
# geometry type:  Multi Polygon
st_read("PG:dbname=postgis", "nc_bool", quiet = TRUE) %>% sapply(class)
# $area
# [1] "numeric"

# $large_area
# [1] "logical"

# $wkb_geometry
# [1] "sfc_MULTIPOLYGON" "sfc" 

@dpprdan
Copy link
Contributor Author

dpprdan commented Apr 24, 2018

👍
So, GDAL apparently supports logical/boolean type now, also for e.g. geopackage as mentioned in #579? Is that already documented somewhere?

suppressPackageStartupMessages(library(sf))
suppressPackageStartupMessages(library(dplyr))

nc_bool <- st_read(
  system.file("shape/nc.shp", package = "sf"),
  quiet = TRUE
)[, c("AREA", "geometry")]
nc_bool$large_area <- nc_bool$AREA >= 0.15
st_write(obj = nc_bool, dsn = "nc_bool.gpkg", delete_layer = TRUE)
#> Deleting layer `nc_bool' using driver `GPKG'
#> Updating layer `nc_bool' to data source `C:\Temp\R\Rtmp0UjShP\nc_bool.gpkg' using driver `GPKG'
#> features:       100
#> fields:         2
#> geometry type:  Multi Polygon
st_read(dsn = "nc_bool.gpkg", quiet = TRUE) %>% sapply(class)
#> $AREA
#> [1] "numeric"
#> 
#> $large_area
#> [1] "logical"
#> 
#> $geom
#> [1] "sfc_MULTIPOLYGON" "sfc"

@edzer
Copy link
Member

edzer commented Apr 25, 2018

Documented in sf, or in gdal? The gdal api docs mention it is a new feature in 2.0, but I didn't see it in the 2.0 release notes.

@dpprdan
Copy link
Contributor Author

dpprdan commented Apr 25, 2018

I meant documented in sf. Something must have happened between your comment from 3 dec and now and I am wondering where an unsuspecting sf user could find out that booleans are now supported by st_write with the GDAL driver? Or in general, where can a user find out, which field types are (not) supported?

@edzer
Copy link
Member

edzer commented Apr 25, 2018

First of all, I don't think any of this (neither support, nor non-support of logical through sf-gdal) ended up in user documentation; github comments simply reflect what I know, or think, at a particular moment in time. It hadn't occurred to me that beyond an OGRFieldType, GDAL would also have an OGRFieldSubType because I didn't see it in the table I pointed to (and it seems back-fitted, IMHO). I think I started off by doing what rgdal did, then added time/datetime support; never had given logical a thought until you raised it. Do you see other types we should look into (I don't)?

@dpprdan
Copy link
Contributor Author

dpprdan commented Apr 25, 2018

No, I don't know any other types we should look into at the moment, but I'll let you know should I run into one 😉 . I do not mean to make a big thing of this. I was just surprised that this works now in sf and genuinely wondered if I could have caught that somewhere. Maybe a line in NEWS would be appropriate, though?

@edzer
Copy link
Member

edzer commented Apr 25, 2018

I was surprised too, but it seems an RTFM case (for me, that is). Will add to NEWS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants