New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Altvec objects in vec_slice #696
Conversation
1116bf9
to
19d52f3
Compare
Adding a test is unfortunately challenging, as AFAIK base R does not include an altrep class that has a custom extract_subset method. However I did confirm this works well with vroom altrep vectors and the devel versions of tibble and dplyr, which use library(vroom)
x <- vroom(vroom_example("mtcars.csv"), altrep = TRUE)[1:3]
#> Observations: 32
#> Variables: 12
#> chr [ 1]: model
#> dbl [11]: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#>
#> Call `spec()` for a copy-pastable column specification
#> Specify the column types with `col_types` to quiet this message
vroom_str(x)
#> 'tbl_df', 'tbl', and 'data.frame': 32 obs., 3 vars.:
#> $model: altrep:true type:vroom::vroom_chr length:32 materialized:false
#> $mpg: altrep:true type:vroom::vroom_dbl length:32 materialized:false
#> $cyl: altrep:true type:vroom::vroom_dbl length:32 materialized:false
# calling directly
vroom_str(vctrs::vec_slice(x[[1]], 1:5))
#> altrep:true type:vroom::vroom_chr length:5 materialized:false
# implicitly by dev tibble
vroom_str(x[x$cyl == 4, ])
#> 'tbl_df', 'tbl', and 'data.frame': 11 obs., 3 vars.:
#> $model: altrep:true type:vroom::vroom_chr length:11 materialized:false
#> $mpg: altrep:true type:vroom::vroom_dbl length:11 materialized:false
#> $cyl: altrep:false type: double length:11
# implicitly by dev dplyr
vroom_str(dplyr::filter(x, cyl == 4))
#> 'tbl_df', 'tbl', and 'data.frame': 11 obs., 3 vars.:
#> $model: altrep:true type:vroom::vroom_chr length:11 materialized:false
#> $mpg: altrep:true type:vroom::vroom_dbl length:11 materialized:false
#> $cyl: altrep:false type: double length:11 Created on 2019-12-06 by the reprex package (v0.3.0) Session infodevtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.1 (2019-07-05)
#> os macOS Mojave 10.14.6
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2019-12-06
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
#> backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.0)
#> callr 3.3.2 2019-09-22 [1] CRAN (R 3.6.0)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.0)
#> devtools 2.2.1.9000 2019-10-03 [1] local
#> digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.0)
#> dplyr 0.8.99.9000 2019-12-06 [1] Github (tidyverse/dplyr@4304c13)
#> ellipsis 0.3.0.9000 2019-12-02 [1] Github (r-lib/ellipsis@6f84df4)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.0)
#> knitr 1.26 2019-11-12 [1] CRAN (R 3.6.0)
#> lobstr 1.1.1 2019-07-02 [1] CRAN (R 3.6.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
#> nvimcom * 0.9-75 2019-10-16 [1] local
#> pillar 1.4.2.9001 2019-12-06 [1] Github (r-lib/pillar@82370d7)
#> pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.6.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.0)
#> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
#> purrr 0.3.3 2019-10-18 [1] CRAN (R 3.6.0)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.0)
#> Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.0)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
#> rlang 0.4.2 2019-11-23 [1] CRAN (R 3.6.0)
#> rmarkdown 1.18 2019-11-27 [1] CRAN (R 3.6.1)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> testthat 2.3.1 2019-12-01 [1] CRAN (R 3.6.1)
#> tibble 2.99.99.9010 2019-12-06 [1] Github (tidyverse/tibble@f4365f7)
#> tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.6.0)
#> usethis 1.5.1.9000 2019-12-02 [1] Github (r-lib/usethis@c7314cf)
#> vctrs 0.2.0.9007 2019-12-06 [1] local
#> vroom * 1.0.2 2019-06-28 [1] CRAN (R 3.6.0)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
#> xfun 0.11 2019-11-12 [1] CRAN (R 3.6.0)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#> zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.0)
#>
#> [1] /Users/jhester/Library/R/3.6/library
#> [2] /Library/Frameworks/R.framework/Versions/3.6/Resources/library |
This is failing on linux because |
Ok I have added a workaround by defining a proxy I may also post on R-devel asking why this function in particular is hidden, when others are not. |
Do we need checks against upstream changes to the header? I guess they can't really make changes to the struct because that would break ABI compatibility (across versions but also with serialised objects)? |
They might be able to add fields to the struct without breaking compatibility, but even if they do this code should still work. It would be safest to have a check of some sort, but I am not sure what we could do, did you have something in mind? |
I agree the code should be forward-compatible if they only add fields to the method table struct. I was thinking about creating a small altrep class and check that it's sliced properly. But eventually we'll have altrep sequences, repetitions, and RLE vectors in the package, so slicing of altrep vectors should be covered by unit tests sooner or later. No worries if you don't have time to write a test class. |
I have a RLE implementation in vroom that I could move over here, I have to convert it to C, but all the logic is already C anyway, so it shouldn't take too long. |
oh that'd be great! It'd give us a good basis for experimenting with RLE classes (#332). |
This dispatches to ALTVEC_EXTRACT_SUBSET for altvec objects, falling back to the normal slicing code if the Altvec object does not declare a extract_subset method. For versions of R prior to the introduction of Altvec (< 3.5.0) stub macros are defined resulting in a dead conditional, which should be removed by dead code optimizer in the compiler.
Since this method is hidden on linux :/
Use this class to test slicing of altrep objects
2ae189a
to
ad20acc
Compare
Ok I added the RLE I used in vroom, the constructor takes a named integer vector to create a RLE character vector. e.g. .Call(vctrs_rle, c(foo = 10L, bar = 5L)) Will create a length 15 character vector, with the first 10 elements as foo and the last 5 as bar. The subset method is not very sophisticated, it just returns a normal vector for the given indices. One could imagine a subset that returned a new RLE, but I think this is enough for the test. |
Thanks! |
This dispatches to ALTVEC_EXTRACT_SUBSET for altvec objects,
falling back to the normal slicing code if the Altvec object does not
declare a extract_subset method.
For versions of R prior to the introduction of Altvec (< 3.5.0) stub
macros are defined resulting in a dead conditional, which should be
removed by dead code optimizer in the compiler.