Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically reconnect after de-serialization #72

Closed
mllg opened this issue May 26, 2018 · 3 comments
Closed

Automatically reconnect after de-serialization #72

mllg opened this issue May 26, 2018 · 3 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@mllg
Copy link

mllg commented May 26, 2018

In one of my packages I have an object which internally stores a tbl / tbl_dbi. For parallelization, I now need to serialize this object to the file system so that a computational node can read the object, query the database and start its computation.

Unfortunately, as an external pointer, the connection becomes invalid during serialization / deserialization. My workaround for this is to reestablish the connection as illustrated in this code snippet:

requireNamespace("DBI")
requireNamespace("RSQLite")
requireNamespace("dplyr")
requireNamespace("dbplyr")
requireNamespace("pool")

# create a tbl on iris stored in a SQLite db 
dbname = tempfile("db_", fileext = ".sqlite")
con = DBI::dbConnect(RSQLite::SQLite(), dbname)
# con = pool::dbPool(drv = RSQLite::SQLite(), dbname = dbname)
dplyr::copy_to(con, iris, "iris", temporary = FALSE)
data = dplyr::tbl(con, "iris")

# destroy the connection
tmp = tempfile()
saveRDS(data, file = tmp)
data = readRDS(tmp)

# verify it is disconnected
head(data) # -> error / invalid pointer
data$src$con
DBI::dbIsValid(data$src$con)

# reconnect
data$src$con = DBI::dbConnect(RSQLite::SQLite(), dbname)

head(data)

As far as I understand it, pool as a connection manager should automatically reconnect, but I just get the same error if I replace the connection with a pool object ("external pointer is not valid").

@hadley
Copy link
Member

hadley commented Jan 5, 2023

Here's a minimal reprex that I think reproduces the problem

library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
library(pool)

dbname <- tempfile("db_", fileext = ".sqlite")
con <- dbPool(RSQLite::SQLite(), dbname = dbname)
DBI::dbWriteTable(con, "iris", iris)

# destroy the connection
tmp <- tempfile()
saveRDS(con, file = tmp)
con2 <- readRDS(tmp)

DBI::dbReadTable(con2, "iris")
#> Error: external pointer is not valid

con2$fetch()
#> <SQLiteConnection>
#>   DISCONNECTED
str(DBI::dbReadTable(con2, "iris"))
#> 'data.frame':    150 obs. of  5 variables:
#>  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>  $ Species     : chr  "setosa" "setosa" "setosa" "setosa" ...

Created on 2023-01-05 with reprex v2.0.2

It looks like we're missing a tryCatch() somewhere because it's certainly possible to manually tickle the corrupted connection to reconnect.

@hadley hadley added the bug an unexpected problem or unintended behavior label Jan 5, 2023
@hadley
Copy link
Member

hadley commented Jan 5, 2023

Ah, the problem is that by default validation is only performed every validationInterval seconds, which defaults to 600 (i.e. 10 minutes). If you demand that the validation happens on every access, this works:

library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
library(pool)

dbname <- tempfile("db_", fileext = ".sqlite")
con <- dbPool(RSQLite::SQLite(), dbname = dbname, validationInterval = 0)
DBI::dbWriteTable(con, "iris", iris)

# destroy the connection
tmp <- tempfile()
saveRDS(con, file = tmp)
con2 <- readRDS(tmp)

nrow(DBI::dbReadTable(con2, "iris"))
#> Warning in connection_release(conn@ptr): Already disconnected
#> Warning: It wasn't possible to activate and/or validate the object. Trying again
#> with a new object.
#> [1] 150

Created on 2023-01-05 with reprex v2.0.2

@hadley
Copy link
Member

hadley commented Jan 9, 2023

Pool is working as documented here, and since there's no easy way to control serialisation of an object, unfortunately I don't think there's much else we can do here.

@hadley hadley closed this as completed Jan 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

2 participants