MAP type is not working in R #61

Closed
nassuphis opened this issue Jan 4, 2024 · 5 comments · Fixed by #165
Labels: enhancement (New feature or request)

@nassuphis

> con <- dbConnect(duckdb::duckdb())
> dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)
@Tmonster
Contributor

Tmonster commented Jan 4, 2024

Thanks! Tagging duckdb/duckdb#8859, since the two issues are similar.

@nassuphis
Author

The histogram(arg) aggregate function fails for the same reason, since it also returns a MAP:

> tbl(con, sql("SELECT * FROM range(100) AS tt(x)")) %>%
+   summarise(h = sql("histogram(x)"))
Error in `collect()`:
! Failed to collect lazy table.
Caused by error:
! rapi_prepare: Unknown column type for prepare: MAP(BIGINT, UBIGINT)
Run `rlang::last_trace()` to see where the error occurred.

@krlmlr
Collaborator

krlmlr commented Apr 24, 2024

Thanks, confirmed.

library(DBI)
con <- dbConnect(duckdb::duckdb())
dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
#> Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)

dbExecute(con, "COPY (SELECT map([1,2],['a','b']) AS x) TO 'map.parquet'")
#> [1] 1
parquet <- arrow::read_parquet("map.parquet")
tibble::as_tibble(parquet)
#> # A tibble: 1 × 1
#>                                                            x
#>   <list<
#>   tbl_df<
#>     key  : integer
#>     value: character
#>   >
#> >>
#> 1                                                    [2 × 2]
parquet$x[[1]]
#> # A tibble: 2 × 2
#>     key value
#>   <int> <chr>
#> 1     1 a    
#> 2     2 b

Created on 2024-04-24 with reprex v2.1.0

For reference, Arrow converts each map to a two-column data frame (key, value). This is what we should do here too.
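A minimal base R sketch of that target shape, assuming the keys and values of each map arrive as two parallel vectors. The helper name `map_to_df` is hypothetical and is not part of duckdb or arrow; it only illustrates the representation, not how the package implements it.

```r
# Keys and values of one map, as two parallel vectors (assumed input shape).
keys <- c(1L, 2L)
values <- c("a", "b")

# Hypothetical helper: represent a single map as a two-column data frame.
map_to_df <- function(keys, values) {
  stopifnot(length(keys) == length(values))
  data.frame(key = keys, value = values, stringsAsFactors = FALSE)
}

# A MAP column with n rows becomes a list column holding n such data frames.
result <- data.frame(id = 1L)
result$x <- list(map_to_df(keys, values))

# Each cell of the list column is an ordinary data frame.
result$x[[1]]
```

Storing one data frame per row keeps keys and values typed separately, which matches the Parquet round trip shown above.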

Side note: The headers for list-of-data-frame columns look odd. I suspect we need a workaround in pillar.

@grantmcdermott

I mentioned this at the old repo but, for posterity, a relatively simple workaround is UNNEST(MAP_ENTRIES(...), recursive := TRUE). This expands the map into a regular two-column data.frame (key, value) that R understands.

Using @krlmlr's example:

library(DBI)
con = dbConnect(duckdb::duckdb(), shutdown = TRUE)
dbGetQuery(
    con,
    "
    FROM (SELECT map([1,2],['a','b']) AS x)
    SELECT UNNEST(MAP_ENTRIES(x), recursive := TRUE)
    "
)
#>   key value
#> 1   1     a
#> 2   2     b

hannes self-assigned this May 8, 2024
@hannes
Member

hannes commented May 8, 2024

PR is here: #165
