nicer printing of list columns #33

Closed
jennybc opened this issue Mar 3, 2016 · 4 comments
@jennybc
Member

jennybc commented Mar 3, 2016

It seems likely that tbl_dfs will hold more exotic objects in the near future, which poses a printing challenge. Whatever RStudio is doing in View() seems like a good idea. Here are two views of a tbl_df that holds a bunch of tweets, stored as S4 status objects from the twitteR package. Could the regular print method behave more like View() and show less, to reduce the risk of obscuring other variables? This is somewhat related to a question I posed on R-help and SO earlier this year.

[Screenshot: 2016-03-03, 11:05:47 am]

[Screenshot: 2016-03-03, 11:06:19 am]

@krlmlr
Member

krlmlr commented Mar 8, 2016

We should definitely improve this. Here is the reprex copied from your SO question:

library(Matrix)
library(dplyr)
m <- new("dgCMatrix")
isS4(m)
#> [1] TRUE
df <- data.frame(id = 1:2)
df$matrices <- list(m, m)
df
#> Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : first argument must be atomic
tbl_df(df)
#> Source: local data frame [2 x 2]
#> 
#>      id
#>   (int)
#> 1     1
#> 2     2
#> Variables not shown: matrices (list).

## force dplyr to show the tricky column
tbl_df(select(df, matrices))
#> Source: local data frame [2 x 1]
#> 
#>                                                                      matrices
#>                                                                        (list)
#> 1 <S4:dgCMatrix, CsparseMatrix, dsparseMatrix, generalMatrix, dCsparseMatrix,
#> 2 <S4:dgCMatrix, CsparseMatrix, dsparseMatrix, generalMatrix, dCsparseMatrix,
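
For illustration only, here is a minimal sketch of the kind of shortening this output suggests. It assumes a hypothetical helper `short_label()` and two toy classes, `Base` and `Derived`; none of these exist in dplyr, tibble, or Matrix, and this is not necessarily how the printing code works. It contrasts `is()`, which lists the whole class hierarchy, with `class()`, which gives only the object's own class.

```r
library(methods)

# Hypothetical helper (not part of dplyr or tibble): label one list-column
# cell with its own class name instead of the full class hierarchy.
short_label <- function(x) {
  if (isS4(x)) {
    # class() on an S4 instance returns only the most specific class name,
    # whereas is() returns that class plus every superclass.
    paste0("<S4: ", class(x)[[1]], ">")
  } else {
    paste0("<", class(x)[[1]], ">")
  }
}

# Toy S4 hierarchy (assumed names), so the sketch does not depend on Matrix.
setClass("Base", representation(id = "numeric"))
setClass("Derived", contains = "Base")

d <- new("Derived", id = 1)
is(d)             # "Derived" "Base"
short_label(d)    # "<S4: Derived>"
short_label(1:3)  # "<integer>"
```

A label built this way has a width that depends only on one class name, so a deep hierarchy such as the one `dgCMatrix` inherits from would no longer push other columns off the screen.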

@krlmlr krlmlr added this to the 2.0 milestone Mar 8, 2016
@hadley hadley closed this as completed in 6cfb350 Mar 17, 2016
@hadley
Member

hadley commented Mar 17, 2016

@krlmlr I think I remember that you like to generate the NEWS from the git commits, so I didn't add a news entry.

krlmlr pushed a commit that referenced this issue Mar 18, 2016
- `[[.tbl_df()` now falls back to regular subsetting when used with anything other than a single string (#29).
- When used in list-columns, S4 objects only print the class name rather than the full class hierarchy (#33).
- Further cleanup for `repair_names()`.
- Add test that `[.tbl_df()` does not change class (#41, @jennybc).
@krlmlr
Copy link
Member

krlmlr commented Mar 18, 2016

Thanks. The NEWS is initialized from the bodies of the commit messages in "master" (4b6a6d1); the header is ignored. I can edit the entries afterwards (3cc26eb), but often this isn't even necessary.

krlmlr pushed a commit that referenced this issue Mar 21, 2016
- Interface changes
    - `glimpse()` obtains default width from `tibble.width` option (#35, #56).
    - Don't export `dim_desc()` (#50, #55).
    - New `has_rownames()` and `remove_rownames()` (#44).
- Minor modifications
    - `frame_data()` returns 0-row but n-col data frame if no data.
    - `[[.tbl_df()` now falls back to regular subsetting when used with anything other than a single string (#29).
    - When used in list-columns, S4 objects only print the class name rather than the full class hierarchy (#33).
    - Add test that `[.tbl_df()` does not change class (#41, @jennybc).
    - Improve `[.tbl_df()` error message.
- Documentation
    - Improve documentation and vignette.
    - Update README, with edits (#52, @bhive01) and enhancements (#54, @jennybc).
- Code quality
    - Full test coverage (#24, #53).
    - Renamed `obj_type()` to `obj_sum()`, improvements, better integration with `type_sum()`.
    - Cleanup for `column_to_rownames()` and `rownames_to_column()` (#45).
    - Cleanup for `repair_names()` (#43). Whitespace is not touched by this function (#47).
    - Cleanup for `add_row()` (#46).
    - Regression tests load known output from file (#49).
    - Internal cleanup.
    - Don't test R-devel on AppVeyor because of missing directory on CRAN.
krlmlr pushed a commit that referenced this issue Mar 22, 2016
- Initial CRAN release

- Extracted from `dplyr` 0.4.3

- Exported functions:
    - `tbl_df()`
    - `as_data_frame()`
    - `data_frame()`, `data_frame_()`
    - `frame_data()`, `tibble()`
    - `glimpse()`
    - `trunc_mat()`, `knit_print.trunc_mat()`
    - `type_sum()`
    - New `lst()` and `lst_()` create lists in the same way that
      `data_frame()` and `data_frame_()` create data frames (tidyverse/dplyr#1290).
      `lst(NULL)` doesn't raise an error (#17, @jennybc), but always
      uses deparsed expression as name (even for `NULL`).
    - New `add_row()` makes it easy to add a new row to data frame
      (tidyverse/dplyr#1021).
    - New `rownames_to_column()` and `column_to_rownames()` (#11, @zhilongjia).
    - New `has_rownames()` and `remove_rownames()` (#44).
    - New `repair_names()` fixes missing and duplicate names (#10, #15,
      @r2evans).
    - New `is_vector_s3()`.

- Features
    - New `as_data_frame.table()` with argument `n` to control name of count
      column (#22, #23).
    - Use `tibble` prefix for options (#13, #36).
    - `glimpse()` now (invisibly) returns its argument (tidyverse/dplyr#1570). It
      is now a generic, the default method dispatches to `str()`
      (tidyverse/dplyr#1325).  The default width is obtained from the
      `tibble.width` option (#35, #56).
    - `as_data_frame()` is now an S3 generic with methods for lists (the old
      `as_data_frame()`), data frames (trivial), matrices (with efficient
      C++ implementation) (tidyverse/dplyr#876), and `NULL` (returns a 0-row
      0-column data frame) (#17, @jennybc).
    - Non-scalar input to `frame_data()` and `tibble()` (including lists)
      creates list-valued columns (#7). These functions return 0-row but n-col
      data frame if no data.

- Bug fixes
    - `frame_data()` properly constructs rectangular tables (tidyverse/dplyr#1377,
      @kevinushey).

- Minor modifications
    - Uses `setOldClass(c("tbl_df", "tbl", "data.frame"))` to help with S4
      (tidyverse/dplyr#969).
    - `tbl_df()` automatically generates column names (tidyverse/dplyr#1606).
    - `tbl_df`s gain `$` and `[[` methods that are ~5x faster than the defaults,
      never do partial matching (tidyverse/dplyr#1504), and throw an error if the
      variable does not exist.  `[[.tbl_df()` falls back to regular subsetting
      when used with anything other than a single string (#29).
      `base::getElement()` now works with tibbles (#9).
    - `all_equal()` allows comparing data frames, ignoring row and column order,
      and optionally ignoring minor differences in type (e.g. int vs. double)
      (tidyverse/dplyr#821).  Used by `all.equal()` for tibbles.  (This package
      contains a pure R implementation of `all_equal()`, the `dplyr` code has
      identical behavior but is written in C++ and thus faster.)
    - The internals of `data_frame()` and `as_data_frame()` have been aligned,
      so `as_data_frame()` will now automatically recycle length-1 vectors.
      Both functions give more informative error messages if you are attempting
      to create an invalid data frame.  You can no longer create a data frame
      with duplicated names (tidyverse/dplyr#820).  Both functions now check that
      you don't have any `POSIXlt` columns, and tell you to use `POSIXct` if you
      do (tidyverse/dplyr#813).  `data_frame(NULL)` raises error "must be a 1d
      atomic vector or list".
    - `trunc_mat()` and `print.tbl_df()` are considerably faster if you have
      very wide data frames.  They will now also only list the first 100
      additional variables not already on screen - control this with the new
      `n_extra` parameter to `print()` (tidyverse/dplyr#1161).  The type of list
      columns is printed correctly (tidyverse/dplyr#1379).  The `width` argument is
      also used for 0-row or 0-column data frames (#18).
    - When used in list-columns, S4 objects only print the class name rather
      than the full class hierarchy (#33).
    - Add test that `[.tbl_df()` does not change class (#41, @jennybc).  Improve
      `[.tbl_df()` error message.

- Documentation
    - Update README, with edits (#52, @bhive01) and enhancements (#54,
      @jennybc).
    - `vignette("tibble")` describes the difference between tbl_dfs and
      regular data frames (tidyverse/dplyr#1468).

- Code quality
    - Test using new-style Travis-CI and AppVeyor. Full test coverage (#24,
      #53). Regression tests load known output from file (#49).
    - Renamed `obj_type()` to `obj_sum()`, improvements, better integration with
      `type_sum()`.
    - Internal cleanup.
@github-actions
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 16, 2020