as_data_frame.matrix can't handle subclassed matrices #110

Closed
tjmahr opened this Issue Jun 28, 2016 · 5 comments

tjmahr commented Jun 28, 2016

I was writing a tidier function with as_data_frame to convert a matrix generated by a model comparison function. The conversion appeared to work until I tried to mutate the tibble. Then I got the error Error: matrix as column is not supported. I was able to trace the problem to the columns of the tibble retaining the "matrix" class of the input.

Here is an example using stats::poly, which also produces a subclassed matrix.

I decided to file an issue because this behavior is unexpected and because as.data.frame does work as expected.

library("dplyr", warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 3.3.1
library("tibble")

poly(1:6, 3) %>% str
#>  poly [1:6, 1:3] -0.598 -0.359 -0.12 0.12 0.359 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : NULL
#>   ..$ : chr [1:3] "1" "2" "3"
#>  - attr(*, "coefs")=List of 2
#>   ..$ alpha: num [1:3] 3.5 3.5 3.5
#>   ..$ norm2: num [1:5] 1 6 17.5 37.3 64.8
#>  - attr(*, "degree")= int [1:3] 1 2 3
#>  - attr(*, "class")= chr [1:2] "poly" "matrix"

poly(1:6, 3) %>% as_data_frame %>% mutate(Condition = "Test")
#> Error in eval(expr, envir, enclos): matrix as column is not supported

poly(1:6, 3) %>% as_data_frame %>% str
#> Classes 'tbl_df', 'tbl' and 'data.frame':    6 obs. of  3 variables:
#>  $ 1:Classes 'poly', 'matrix'  atomic [1:6] -0.598 -0.359 -0.12 0.12 0.359 ...
#>   .. ..- attr(*, "coefs")=List of 2
#>   .. .. ..$ alpha: num [1:3] 3.5 3.5 3.5
#>   .. .. ..$ norm2: num [1:5] 1 6 17.5 37.3 64.8
#>   .. ..- attr(*, "degree")= int [1:3] 1 2 3
#>  $ 2:Classes 'poly', 'matrix'  atomic [1:6] 0.546 -0.109 -0.436 -0.436 -0.109 ...
#>   .. ..- attr(*, "coefs")=List of 2
#>   .. .. ..$ alpha: num [1:3] 3.5 3.5 3.5
#>   .. .. ..$ norm2: num [1:5] 1 6 17.5 37.3 64.8
#>   .. ..- attr(*, "degree")= int [1:3] 1 2 3
#>  $ 3:Classes 'poly', 'matrix'  atomic [1:6] -0.373 0.522 0.298 -0.298 -0.522 ...
#>   .. ..- attr(*, "coefs")=List of 2
#>   .. .. ..$ alpha: num [1:3] 3.5 3.5 3.5
#>   .. .. ..$ norm2: num [1:5] 1 6 17.5 37.3 64.8
#>   .. ..- attr(*, "degree")= int [1:3] 1 2 3

# workaround
poly(1:6, 3) %>% as.data.frame %>% as_data_frame %>% mutate(Condition = "Test")
#> # A tibble: 6 x 4
#>            1          2          3 Condition
#>        <dbl>      <dbl>      <dbl>     <chr>
#> 1 -0.5976143  0.5455447 -0.3726780      Test
#> 2 -0.3585686 -0.1091089  0.5217492      Test
#> 3 -0.1195229 -0.4364358  0.2981424      Test
#> 4  0.1195229 -0.4364358 -0.2981424      Test
#> 5  0.3585686 -0.1091089 -0.5217492      Test
#> 6  0.5976143  0.5455447  0.3726780      Test

devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.0 (2016-05-03)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  tz       America/Chicago             
#>  date     2016-06-28
#> Packages ------------------------------------------------------------------
#>  package    * version date       source                        
#>  assertthat   0.1     2013-12-06 CRAN (R 3.0.2)                
#>  DBI          0.4-1   2016-05-08 CRAN (R 3.2.5)                
#>  devtools     1.12.0  2016-06-24 CRAN (R 3.3.1)                
#>  digest       0.6.9   2016-01-08 CRAN (R 3.2.2)                
#>  dplyr      * 0.5.0   2016-06-24 CRAN (R 3.3.1)                
#>  evaluate     0.9     2016-04-29 CRAN (R 3.2.5)                
#>  formatR      1.4     2016-05-09 CRAN (R 3.2.3)                
#>  htmltools    0.3.5   2016-03-21 CRAN (R 3.2.4)                
#>  knitr        1.13    2016-05-09 CRAN (R 3.2.3)                
#>  lazyeval     0.2.0   2016-06-12 CRAN (R 3.3.0)                
#>  magrittr     1.5     2014-11-22 CRAN (R 3.1.2)                
#>  memoise      1.0.0   2016-01-29 CRAN (R 3.2.3)                
#>  R6           2.1.2   2016-01-26 CRAN (R 3.2.3)                
#>  Rcpp         0.12.5  2016-05-14 CRAN (R 3.2.5)                
#>  rmarkdown    0.9.6   2016-05-01 CRAN (R 3.2.3)                
#>  stringi      1.1.1   2016-05-27 CRAN (R 3.2.5)                
#>  stringr      1.0.0   2015-04-30 CRAN (R 3.2.0)                
#>  tibble     * 1.0-12  2016-06-28 Github (hadley/tibble@1e5b140)
#>  withr        1.0.2   2016-06-20 CRAN (R 3.3.1)                
#>  yaml         2.1.13  2014-06-12 CRAN (R 3.1.0)

krlmlr commented Jul 30, 2016

Each column gets a copy of the input object's attributes, including class: https://github.com/hadley/tibble/blob/6340652ee468bffcffa8180234d00f282a0ec55e/src/matrixToDataFrame.cpp#L21. This is necessary e.g. to support conversion of Date matrices, but unhelpful in the case shown here.
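
The effect described above can be avoided on the user side by reducing the matrix to a bare one before converting it. The following is a minimal sketch in base R, not the fix that went into tibble; `strip_matrix_class()` is a hypothetical helper name introduced only for this illustration.

```r
# Hypothetical helper (not part of tibble): reduce a classed matrix to a
# bare matrix so that no class or extra attributes can reach the columns.
strip_matrix_class <- function(x) {
  stopifnot(is.matrix(x))
  bare <- unclass(x)
  # Keep only the attributes that define the matrix shape and its names.
  keep <- intersect(names(attributes(bare)), c("dim", "dimnames"))
  attributes(bare) <- attributes(bare)[keep]
  bare
}

m <- poly(1:6, 3)
df <- as.data.frame(strip_matrix_class(m))
str(df)
```

The resulting data frame has plain numeric columns, so it can be passed on to as_data_frame and mutate in the same way as the workaround shown in the original report.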

krlmlr self-assigned this Jul 30, 2016

krlmlr added the in progress label Jul 30, 2016

krlmlr closed this in 8fe680c Jul 30, 2016

krlmlr pushed a commit that referenced this issue Jul 30, 2016

Kirill Müller
Merge tag 'v1.1-4'
- `as_tibble.matrix()` doesn't add the `class` attribute of the original matrix to the columns of the new data frame. A test had to be adapted for this, but it used a matrix of `Date` objects which don't seem to be that useful in R (#110).

krlmlr removed the in progress label Jul 30, 2016

krlmlr reopened this Aug 18, 2016

krlmlr added this to the 1.2 milestone Aug 18, 2016

krlmlr commented Aug 18, 2016

A factor (or otherwise classed) matrix can be created via dim<-:

a <- factor(letters)
dim(a) <- c(13, 2)
class(a)
is.matrix(a)

Turns out the poly class is a very special case. I'll simply implement as_data_frame.poly().
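
One way to see the difference between the two cases is to compare `is.matrix()` with `inherits()`. This is a sketch of one possible reading; the comment above does not spell out why `poly` is special. A factor matrix is matrix-shaped but does not name "matrix" in its class attribute, whereas `poly()` names it explicitly.

```r
# Factor matrix built via dim<-, as in the comment above.
a <- factor(letters)
dim(a) <- c(13, 2)

# Subclassed matrix from stats::poly, as in the original report.
p <- poly(1:6, 3)

# Both objects have a two-dimensional dim attribute ...
is_matrix_shaped <- c(factor = is.matrix(a), poly = is.matrix(p))

# ... but only poly lists "matrix" in its class attribute.
names_matrix_class <- c(factor = inherits(a, "matrix"),
                        poly = inherits(p, "matrix"))

print(is_matrix_shaped)
print(names_matrix_class)
```

Copying the class attribute to each column is harmless for the factor matrix, because the columns become ordinary factors, but for `poly` it labels every column as a "matrix", which is what triggers the error in the report.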

krlmlr commented Aug 18, 2016

@hadley: Can you think of other common matrix subclasses that may require special treatment here?

hadley commented Aug 18, 2016

The most important would be table, but I think you already handle that. Maybe ftable? (But that's v. low priority)

krlmlr added a commit that referenced this issue Aug 18, 2016

Merge branch 'b-#154-tidyr'. Fixes #154.
- `as_tibble.matrix()` doesn't remove the `"class"` attribute anymore, to support (again) conversion of `factor` and `Date` matrices (#110, #154).
- New `as_tibble.poly()` to support conversion of a `poly` object to a tibble.

krlmlr commented Aug 18, 2016

Thanks. These are handled by forwarding to as.data.frame(). Not adding a test for now.
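
For reference, this is what base R's `as.data.frame()` does with a `table` object. The sketch uses only base R and the built-in `mtcars` data; it illustrates the base conversion, not tibble's own code path.

```r
# A two-way contingency table is matrix-shaped, with class "table".
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# as.data.frame() on a table returns one row per cell, with the counts
# in a column named Freq, rather than one column per matrix column.
df <- as.data.frame(tab)
str(df)
```

Because the table method returns this long layout, a `table` never reaches the matrix conversion that copies attributes onto each column.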

krlmlr closed this Aug 18, 2016

krlmlr added a commit that referenced this issue Aug 18, 2016

Merge tag 'v1.1-8'
- The `tibble.width` option is used for `glimpse()` only if it is finite (#153).
- Add guidance to install `nycflights13` package to examples (#152).
- New object summary vignette that shows which methods to define for custom vector classes to be used as tibble columns (#151).
- `as_tibble.matrix()` doesn't remove the `"class"` attribute anymore, to support (again) conversion of `factor` and `Date` matrices (#110, #154).
- New `as_tibble.poly()` to support conversion of a `poly` object to a tibble.

krlmlr added a commit that referenced this issue Aug 26, 2016

Merge tag 'v1.2'
- The `tibble.width` option is used for `glimpse()` only if it is finite (#153, @kwstat).
- New `as_tibble.poly()` to support conversion of a `poly` object to a tibble (#110).
- `add_row()` now correctly handles existing columns of type `list` that are not updated (#148).
- `all.equal()` doesn't throw an error anymore if one of the columns is named `na.last`, `decreasing` or `method` (#107, @BillDunlap).

- New `add_column()`, analogously to `add_row()` (#99).
- `print.tbl_df()` gains `n_extra` method and will have the same interface as `trunc_mat()` from now on.
- `add_row()` and `add_column()` gain `.before` and `.after` arguments which indicate the row (by number) or column (by number or name) before or after which the new data are inserted. Updated or added columns cannot be named `.before` or `.after` (#99).
- Rename `frame_data()` to `tribble()`, stands for "transposed tibble". The former is still available as alias (#132, #143).

- `add_row()` now can add multiple rows, with recycling (#142, @jennybc).
- Use multiply character `×` instead of `x` when printing dimensions (#126). Output tests had to be disabled for this on Windows.
- Back-tick non-semantic column names on output (#131).
- Use `dttm` instead of `time` for `POSIXt` values (#133), which is now used for columns of the `difftime` class.
- Better output for 0-row results when total number of rows is unknown (e.g., for SQL data sources).

- New object summary vignette that shows which methods to define for custom vector classes to be used as tibble columns (#151).
- Added more examples for `print.tbl_df()`, now using data from `nycflights13` instead of `Lahman` (#121), with guidance to install `nycflights13` package if necessary (#152).
- Minor changes in vignette (#115, @helix123).