Skip to content
This repository has been archived by the owner on Apr 14, 2018. It is now read-only.

setdiff fails when applied to same remote tbl #73

Closed
imanuelcostigan opened this issue Jul 9, 2016 · 7 comments
Closed

setdiff fails when applied to same remote tbl #73

imanuelcostigan opened this issue Jul 9, 2016 · 7 comments
Assignees
Milestone

Comments

@imanuelcostigan
Copy link
Owner

imanuelcostigan commented Jul 9, 2016

Expected behaviour

Using SQLite as example

sl <- src_sqlite(tempfile(), create = TRUE)
data_frame(a = 1:3, b = letters[1:3]) %>%
  copy_to(sl, ., "df1")
setdiff(tbl(sl, "df1"), tbl(sl, "df1"))
#> Source:   query [?? x 2]\
#> Database: sqlite 3.8.6 
#> [/var/folders/9f/kz67r0rn3qb9tx_rmgbps94r0000gq/T//RtmpG1H4kx/file1db865e13c7c]

#> # ... with at least 0 rows total, and 2 more variables: a <int>, b <chr>

Actual behaviour

Using test server

ss <- RSQLServer::src_sqlserver("TEST", database = "DBItest")
airlines_tbl <- tbl(ss, "airlines")
setdiff(airlines_tbl, airlines_tbl) 
#> Source:   query [?? x 2]
#> Database: SQLServer 13.0.1100 [SQLuser@13.73.118.48:1433/DBItest]
#> 
#> Error: Each variable must be a 1d atomic vector or list.
#> Problem variables: 'carrier', 'name'

Steps to reproduce behaviour

See above.

@imanuelcostigan imanuelcostigan added this to the v0.3 milestone Jul 9, 2016
@imanuelcostigan imanuelcostigan self-assigned this Jul 9, 2016
@imanuelcostigan
Copy link
Owner Author

Traceback:

d> traceback()
18: stop(..., call. = FALSE, domain = NA)
17: stopc(problem, ".\n", "Problem variables: ", format_n(vars))
16: invalid_df("Each variable must be a 1d atomic vector or list", 
        x, !is_1d)
15: check_tibble(x)
14: list_to_tibble(x, validate, raw_rownames(x))
13: as_data_frame.data.frame(data)
12: as_data_frame(data)
11: tbl_df(data)
10: grouped_df(out, groups(x))
9: collect.tbl_sql(x, n = n)
8: collect(x, n = n)
7: as.data.frame(collect(x, n = n))
6: as.data.frame.tbl_sql(head(x, n))
5: as.data.frame(head(x, n))
4: trunc_mat(x, n = n, width = width)
3: print(trunc_mat(x, n = n, width = width))
2: print.tbl_sql(x)
1: function (x, ...) 
   UseMethod("print")(x)

@imanuelcostigan
Copy link
Owner Author

This isn't a RSQLServer bug, but one upstream:

sd <- setdiff(airlines_tbl, airlines_tbl) %>% collect()
sd
# A tibble: 0 x 2
# ... with 2 variables: carrier <chr>, name <chr>

@imanuelcostigan
Copy link
Owner Author

imanuelcostigan commented Jul 23, 2016

  • Take a look at what happens when trying to return an empty resultset (fields defined, but no records)

@imanuelcostigan
Copy link
Owner Author

tbl(ss, "airlines") %>% filter(carrier == 'ZZ') returns same error message:

Error: Each variable must be a 1d atomic vector or list.
Problem variables: 'carrier', 'name'

@imanuelcostigan
Copy link
Owner Author

Though:

dbGetQuery(ss$con, "SELECT * FROM AIRLINES WHERE carrier = 'ZZ'")
[1] carrier name   
<0 rows> (or 0-length row.names)

@imanuelcostigan
Copy link
Owner Author

imanuelcostigan commented Jul 23, 2016

Debugging dplyr::grouped_df when executing tbl(ss, "airlines") %>% filter(carrier == 'ZZ'), executes tbl_df(data) where data is

> str(data)
'data.frame':   0 obs. of  2 variables:
 $ carrier: NULL
 $ name   : NULL

Whereas same debug process yields following for SQLite example above:

> str(data)
'data.frame':   0 obs. of  2 variables:
 $ a: int 
 $ b: chr 

Looks like the process of creating a tibble needs to specify the return type of every field of the form data.frame(a = integer(), b = numeric())

@krlmlr
Copy link

krlmlr commented Aug 2, 2016

Are you sure this is an upstream bug? A zero-row data frame with correctly typed columns is returned by e.g. a SQLite data source.

> iris %>% dplyr::copy_to(dplyr::src_memdb(), .) %>% dplyr::filter(Petal.Width > 10)
Source:   query [?? x 5]
Database: sqlite 3.11.1 [:memory:]

# ... with 5 variables: Sepal.Length <dbl>, Sepal.Width <dbl>, Petal.Length <dbl>, Petal.Width <dbl>, Species <chr>

imanuelcostigan added a commit that referenced this issue Aug 7, 2016
Work around for #73 while upstream discussion continues.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants