Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polygon/multipolygon columns: incorrect write, correct read #439

Closed
pachadotdev opened this issue Jul 31, 2023 · 7 comments
Closed

Polygon/multipolygon columns: incorrect write, correct read #439

pachadotdev opened this issue Jul 31, 2023 · 7 comments

Comments

@pachadotdev
Copy link
Contributor

Incorrect write

If we open a regular DBI connection, as in the next example, we won't correctly write to Postgres with Postgis correctly configured.

library(RPostgres)
library(sf)
library(archive)

con <- dbConnect(
  drv = Postgres(),
  dbname = "shiny"
)

download.file("https://www2.census.gov/geo/tiger/TIGER2014/TRACT/tl_2014_01_tract.zip", "tl_2014_01_tract.zip")

archive_extract("tl_2014_01_tract.zip")

d <- read_sf("tl_2014_01_tract.shp")

# this saves geometry as text
# sf::dbWriteTable(con, "carto", d, append = TRUE)
# RPostgres::dbWriteTable(con, "carto", d, append = TRUE)

# this saves geometry as geometry (not working)
# st_write(d, con, "carto", append = TRUE, layer = "carto", layer_options = c("GEOMETRY_NAME=geometry"))

# this saves geometry as geometry
st_write(
  d2,
  "PG:host=localhost port=5432 dbname=shiny user=xxx password=yyy",
  layer = "carto",
  layer_options = c("GEOMETRY_NAME=geometry"),
  append = TRUE
)

Correct read

library(ggplot2)
library(dplyr)

# Ggplot2 won't like the <pq_gmtry> but it's correct
d2 <- tbl(con, "carto") %>%
    filter(countyfp == "001") %>%
    collect()

glimpse(d2)

... TRUNCATED OUTPUT ...
$ countyfp <chr> "001", "001", "001", "001", "001", "001", "001", "001", "001"…
$ tractce  <chr> "021100", "021000", "020900", "020100", "020300", "020400", "…
$ aland    <dbl> 478168607, 386854992, 292754413, 9810181, 5349273, 6382705, 1…
$ geometry <pq_gmtry> 0106000020AD100000010000000103000000010000003B0A0000A437…
...

# convert pq_gmtry to multipolygon
d2 <- d2 %>%
  mutate(geometry = st_as_sfc(geometry))

# plot ok
ggplot() +
    geom_sf(data = d2, aes(fill = aland, geometry = geometry))

dbDisconnect(con)

Here's my session info:

─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       Linux Mint 21.2
 system   x86_64, linux-gnu
 ui       X11
 language en_CA:en
 collate  en_CA.UTF-8
 ctype    en_CA.UTF-8
 tz       America/Toronto
 date     2023-07-31
 pandoc   2.9.2.1 @ /usr/bin/pandoc

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 bit           4.0.5      2022-11-15 [1] CRAN (R 4.3.1)
 bit64         4.0.5      2020-08-30 [1] CRAN (R 4.3.1)
 blob          1.2.4      2023-03-17 [1] CRAN (R 4.3.1)
 cachem        1.0.8      2023-05-01 [1] CRAN (R 4.3.1)
 callr         3.7.3      2022-11-02 [1] CRAN (R 4.3.1)
 class         7.3-22     2023-05-03 [4] CRAN (R 4.3.1)
 classInt      0.4-9      2023-02-28 [1] CRAN (R 4.3.1)
 cli           3.6.1      2023-03-23 [1] CRAN (R 4.3.1)
 colorspace    2.1-0      2023-01-23 [1] CRAN (R 4.3.1)
 crayon        1.5.2      2022-09-29 [1] CRAN (R 4.3.1)
 DBI           1.1.3      2022-06-18 [1] CRAN (R 4.3.1)
 dbplyr        2.3.3      2023-07-07 [1] CRAN (R 4.3.1)
 devtools    * 2.4.5      2022-10-11 [1] CRAN (R 4.3.1)
 digest        0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
 dplyr       * 1.1.2      2023-04-20 [1] CRAN (R 4.3.1)
 e1071         1.7-13     2023-02-01 [1] CRAN (R 4.3.1)
 ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.3.1)
 fansi         1.0.4      2023-01-22 [1] CRAN (R 4.3.1)
 farver        2.1.1      2022-07-06 [1] CRAN (R 4.3.1)
 fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.3.1)
 fs            1.6.2      2023-04-25 [1] CRAN (R 4.3.1)
 generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.1)
 ggplot2     * 3.4.2      2023-04-03 [1] CRAN (R 4.3.1)
 glue          1.6.2      2022-02-24 [1] CRAN (R 4.3.1)
 gtable        0.3.3      2023-03-21 [1] CRAN (R 4.3.1)
 hms           1.1.3      2023-03-21 [1] CRAN (R 4.3.1)
 htmltools     0.5.5      2023-03-23 [1] CRAN (R 4.3.1)
 htmlwidgets   1.6.2      2023-03-17 [1] CRAN (R 4.3.1)
 httpuv        1.6.11     2023-05-11 [1] CRAN (R 4.3.1)
 jsonlite      1.8.7      2023-06-29 [1] CRAN (R 4.3.1)
 KernSmooth    2.23-22    2023-07-10 [4] CRAN (R 4.3.1)
 labeling      0.4.2      2020-10-20 [1] CRAN (R 4.3.1)
 later         1.3.1      2023-05-02 [1] CRAN (R 4.3.1)
 lifecycle     1.0.3      2022-10-07 [1] CRAN (R 4.3.1)
 lubridate     1.9.2      2023-02-10 [1] CRAN (R 4.3.1)
 magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.1)
 memoise       2.0.1      2021-11-26 [1] CRAN (R 4.3.1)
 mime          0.12       2021-09-28 [1] CRAN (R 4.3.1)
 miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.3.1)
 munsell       0.5.0      2018-06-12 [1] CRAN (R 4.3.1)
 pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.1)
 pkgbuild      1.4.2      2023-06-26 [1] CRAN (R 4.3.1)
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.1)
 pkgload       1.3.2.1    2023-07-08 [1] CRAN (R 4.3.1)
 prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.3.1)
 processx      3.8.2      2023-06-30 [1] CRAN (R 4.3.1)
 profvis       0.3.8      2023-05-02 [1] CRAN (R 4.3.1)
 promises      1.2.0.1    2021-02-11 [1] CRAN (R 4.3.1)
 proxy         0.4-27     2022-06-09 [1] CRAN (R 4.3.1)
 ps            1.7.5      2023-04-18 [1] CRAN (R 4.3.1)
 purrr         1.0.1      2023-01-10 [1] CRAN (R 4.3.1)
 R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.1)
 Rcpp          1.0.11     2023-07-06 [1] CRAN (R 4.3.1)
 remotes       2.4.2.1    2023-07-18 [1] CRAN (R 4.3.1)
 rlang         1.1.1      2023-04-28 [1] CRAN (R 4.3.1)
 RPostgres   * 1.4.5.9012 2023-07-22 [1] Github (r-dbi/RPostgres@58a052b)
 scales        1.2.1      2022-08-20 [1] CRAN (R 4.3.1)
 sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.1)
 sf          * 1.0-14     2023-07-11 [1] CRAN (R 4.3.1)
 shiny         1.7.4.1    2023-07-06 [1] CRAN (R 4.3.1)
 stringi       1.7.12     2023-01-11 [1] CRAN (R 4.3.1)
 stringr       1.5.0      2022-12-02 [1] CRAN (R 4.3.1)
 tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.1)
 tidyselect    1.2.0      2022-10-10 [1] CRAN (R 4.3.1)
 timechange    0.2.0      2023-01-11 [1] CRAN (R 4.3.1)
 units         0.8-2      2023-04-27 [1] CRAN (R 4.3.1)
 urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.3.1)
 usethis     * 2.2.2      2023-07-06 [1] CRAN (R 4.3.1)
 utf8          1.2.3      2023-01-31 [1] CRAN (R 4.3.1)
 vctrs         0.6.3      2023-06-14 [1] CRAN (R 4.3.1)
 withr         2.5.0      2022-03-03 [1] CRAN (R 4.3.1)
 xtable        1.8-4      2019-04-21 [1] CRAN (R 4.3.1)

 [1] /home/pacha/R/x86_64-pc-linux-gnu-library/4.3
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library
@dpprdan
Copy link
Contributor

dpprdan commented Aug 23, 2023

This only concerns the sf::dbWriteTable() method, so I think this is better handled there than here.

However, this is partly caused by RPostgres::dbWriteTable() stop()ing if append && !is.null(field.types)

if (append && !is.null(field.types)) {
stopc("Cannot specify `field.types` with `append = TRUE`")
}

This also happens when the table does not (yet) exist, i.e. when there is nothing to append to.

Do you think it would be possible to move this check a few lines down and only stop if the table is found, @krlmlr?

found <- dbExistsTable(conn, name)

Right now, dbWriteTable will create the table anyway if no field.types are provided.

if (!found || overwrite) {
if (is.null(field.types)) {
combined_field_types <- lapply(value, dbDataType, dbObj = conn)
} else {

The downside is that dbWriteTable(append = TRUE, field.types = "something") will work when the table does not exist but stop when it does. This might be confusing. However, we could ameliorate this by issuing a message "You wanted to append, but the table does not exist yet so we created it".

Of course, third-party methods have to provide custom field.types for their datatypes, like sfc, so field.types cannot be NULL for them. So they cannot create a table currently if append = TRUE but the table does not exist.

@krlmlr
Copy link
Member

krlmlr commented Aug 23, 2023

Thanks, missed that. IIUC, we can fix this by fixing sf::dbWriteTable()?

@dpprdan
Copy link
Contributor

dpprdan commented Aug 23, 2023

Thanks, missed that. IIUC, we can fix this by fixing sf::dbWriteTable()?

Yes. But the fix is a bit ugly (IMO, i.e. overriding function arguments).

IMHO it would be best, i.e. cleanest, if append were only possible if the table already exists, but that's probably too much of a breaking change.

@krlmlr
Copy link
Member

krlmlr commented Aug 23, 2023

There are new generics dbCreateTable() and dbAppendTable(), does sf support these?

@dpprdan
Copy link
Contributor

dpprdan commented Aug 23, 2023

Ha, I just had a similar thought, i.e. refactor sf::dbWriteTable() using dbCreateTable() and dbAppendTable().

So no, sf does not support/use them yet.

Anyway, I am not convinced by my "create table if append = TRUE" proposal above myself anymore, so this can be closed IMO.

@dpprdan
Copy link
Contributor

dpprdan commented Dec 21, 2023

Fixed in {sf}: r-spatial/sf#2223

@krlmlr
Copy link
Member

krlmlr commented Apr 1, 2024

Closing, fixed downstream. Thanks @dpprdan!

@krlmlr krlmlr closed this as completed Apr 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants