New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatible with dplyr 0.7.8 rc #338

Closed
krlmlr opened this Issue Nov 9, 2018 · 6 comments

Comments

Projects
None yet
3 participants
@krlmlr

krlmlr commented Nov 9, 2018

About to release dplyr 0.7.8. Noticed that valr needs the build_index_cpp() export with a proper signature, but even after fixing the signature I'm seeing a strange error I don't understand.

dplyr 0.7.8 is here: https://github.com/tidyverse/dplyr/tree/r-0.7.8

The build_index_cpp() export has been changed as early as 0.7.5, no idea why the proble wasn't caught earlier.

library(valr)

x <- tibble::tribble(
  ~chrom, ~start, ~end,
  "chr1", 25, 50,
  "chr1", 100, 125
)

y <- tibble::tribble(
  ~chrom, ~start, ~end,
  "chr1", 30, 75
)

bed_glyph(bed_intersect(x, y))
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> Error in merge_impl(res, max_dist, collapse = FALSE): Need at least one column for `hash()`

Created on 2018-11-09 by the reprex package (v0.2.1.9000)

kriemo added a commit to kriemo/valr that referenced this issue Nov 9, 2018

@kriemo

This comment has been minimized.

Collaborator

kriemo commented Nov 9, 2018

Thanks for bringing this error to our attention.

I think that this error is unrelated to build_index_cpp(), as we package the dplyr headers in valr directly.

@jayhesselberth I think I have a fix for this bug and will submit a PR shortly.

The problem is due to as_tibble dropping groups if a grouped tibble is passed (which can happen in many bed_*() functions ).

library(valr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

x <- tibble::tribble(
  ~chrom, ~start, ~end, ~group,
  "chr1", 25, 50, "A",
  "chr1", 100, 125, "B"
)

x <- group_by(x, group)
x
#> # A tibble: 2 x 4
#> # Groups:   group [2]
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

as_tibble(x)
#> # A tibble: 2 x 4
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

as.tbl_interval(x)
#> # A tibble: 2 x 4
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

Created on 2018-11-09 by the reprex package (v0.2.1)

@kriemo

This comment has been minimized.

Collaborator

kriemo commented Nov 9, 2018

Just to clarify, the current CRAN version of tibble doesn't drop groups (see below), but the current release on github does (see above).

library(valr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

x <- tibble::tribble(
  ~chrom, ~start, ~end, ~group,
  "chr1", 25, 50, "A",
  "chr1", 100, 125, "B"
)

x <- group_by(x, group)
x
#> # A tibble: 2 x 4
#> # Groups:   group [2]
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

as_tibble(x)
#> # A tibble: 2 x 4
#> # Groups:   group [2]
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

as.tbl_interval(x)
#> # A tibble: 2 x 4
#> # Groups:   group [2]
#>   chrom start   end group
#>   <chr> <dbl> <dbl> <chr>
#> 1 chr1     25    50 A    
#> 2 chr1    100   125 B

packageVersion("tibble")
#> [1] '1.4.2'

Created on 2018-11-09 by the reprex package (v0.2.1)

@krlmlr

This comment has been minimized.

krlmlr commented Nov 10, 2018

Good point, I checked with the dev version of tibble indeed. Need to mention that in the breaking changes, or perhaps un-break (for now).

as.data.frame() already removes the "tibble" class from a tibble, so it feels consistent that as_tibble() drops grouping. On the other hand, this is different from how OO languages (like C++ and Java) work.

One way to handle this would be a custom coercer in valr that forwards to as_tibble() or as_grouped_df() as appropriate, and use it internally.

@hadley: Should as_tibble() drop groups from a grouped d.f.? Haven't found advice in adv-r.

@krlmlr

This comment has been minimized.

krlmlr commented Nov 10, 2018

Another way is to apply group_by(add = TRUE):

library(tidyverse)
data.frame(a = 1) %>% group_by(add = TRUE)
#> # A tibble: 1 x 1
#>       a
#>   <dbl>
#> 1     1
tibble(a = 1) %>% group_by(add = TRUE)
#> # A tibble: 1 x 1
#>       a
#>   <dbl>
#> 1     1
tibble(a = 1) %>% group_by(a) %>% group_by(add = TRUE)
#> # A tibble: 1 x 1
#> # Groups:   a [1]
#>       a
#>   <dbl>
#> 1     1

Created on 2018-11-10 by the reprex package (v0.2.1)

@hadley

This comment has been minimized.

hadley commented Nov 10, 2018

I think as_tibble() must reduce to a tibble, removing any extraneous attributes. If you want a tibble or subclass, you need to write:

if (!is_tibble(x)) {
  x <- as_tibble(x)
}
@kriemo

This comment has been minimized.

Collaborator

kriemo commented Nov 12, 2018

Thanks for clarifying the behavior of as_tibble(). I will use the suggested check to fix this issue.

I'm currently working on fixing a ci build issue, once this is resolved I will merge in these changes and notify you.

jayhesselberth added a commit that referenced this issue Nov 13, 2018

@kriemo kriemo referenced this issue Nov 13, 2018

Closed

Release valr 0.4.2 #340

11 of 15 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment