[R] Simplify dataset and table print output #38916
Comments
thisisnic changed the title from "[R] Simplify dataset print output" to "[R] Simplify dataset and table print output" on Nov 28, 2023
thisisnic added a commit that referenced this issue on Mar 13, 2024
### Rationale for this change

When printing objects with data with lots of rows, the output is long and unwieldy.

### What changes are included in this PR?

* Truncates long schema print output and adds the number of columns to dataset print output.
* Add number of columns to output so it's clear how many there are in total

### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes

Before:

``` r
library(arrow)
x <- tibble::tibble(!!!letters, .rows = 5)
InMemoryDataset$create(x)
#> InMemoryDataset
#> "a": string
#> "b": string
#> "c": string
#> "d": string
#> "e": string
#> "f": string
#> "g": string
#> "h": string
#> "i": string
#> "j": string
#> "k": string
#> "l": string
#> "m": string
#> "n": string
#> "o": string
#> "p": string
#> "q": string
#> "r": string
#> "s": string
#> "t": string
#> "u": string
#> "v": string
#> "w": string
#> "x": string
#> "y": string
#> "z": string
arrow_table(x)
#> Table
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> $"u" <string>
#> $"v" <string>
#> $"w" <string>
#> $"x" <string>
#> $"y" <string>
#> $"z" <string>
record_batch(x)
#> RecordBatch
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> $"u" <string>
#> $"v" <string>
#> $"w" <string>
#> $"x" <string>
#> $"y" <string>
#> $"z" <string>
```

After:

``` r
library(arrow)
x <- tibble::tibble(!!!letters, .rows = 5)
InMemoryDataset$create(x)
#> InMemoryDataset
#> 26 columns
#> "a": string
#> "b": string
#> "c": string
#> "d": string
#> "e": string
#> "f": string
#> "g": string
#> "h": string
#> "i": string
#> "j": string
#> "k": string
#> "l": string
#> "m": string
#> "n": string
#> "o": string
#> "p": string
#> "q": string
#> "r": string
#> "s": string
#> "t": string
#> ...
#> Use `schema()` to see entire schema
arrow_table(x)
#> Table
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> ...
#> Use `schema()` to see entire schema
record_batch(x)
#> RecordBatch
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> ...
#> Use `schema()` to see entire schema
```

* Closes: #38916

Lead-authored-by: Nic Crane <thisisnic@gmail.com>
Co-authored-by: Bryce Mecum <petridish@gmail.com>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
Describe the enhancement requested
When we print a dataset, we get a short description of the dataset and then the full schema with one column on each line. This looks fine for datasets with few columns, but can grow unwieldy and messy. An example from a recent dataset I've been working with:
We could do something like the tibble preview, with instructions to call `schema()` to view the full schema. The tibble preview for the dataset, for contrast:

Component(s)
R
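The truncated preview requested in this issue could be sketched roughly as follows. This is a minimal base R illustration, not the code in the arrow package: `format_truncated_schema()` and its `max_fields` argument are hypothetical names, and the schema is stood in for by a named character vector of type names. The cutoff of 20 fields is an assumption chosen to match the "After" output in the commit referenced in this issue.

```r
# Hypothetical helper (not part of the arrow package): build the print lines
# for a schema-like named character vector, showing at most `max_fields`.
format_truncated_schema <- function(fields, max_fields = 20) {
  n <- length(fields)
  shown <- utils::head(fields, max_fields)
  lines <- c(
    paste(n, "columns"),
    # One line per displayed field, e.g. "a": string
    sprintf('"%s": %s', names(shown), unname(shown))
  )
  if (n > max_fields) {
    # Signal that fields were omitted and how to see all of them
    lines <- c(lines, "...", "Use `schema()` to see entire schema")
  }
  lines
}

# Stand-in for a 26-column schema where every column is a string
fields <- stats::setNames(rep("string", 26), letters)
out <- format_truncated_schema(fields)
cat(out, sep = "\n")
```

Returning the lines as a character vector rather than printing directly keeps the formatting logic easy to check separately from the print method that would call it.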