Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print Option: Print columns that fit in single console without wrapping #4074

Merged
merged 40 commits into from Dec 18, 2019
Merged

Print Option: Print columns that fit in single console without wrapping #4074

merged 40 commits into from Dec 18, 2019

Conversation

TysonStanley
Copy link
Member

Closes #1497

This PR addresses @MichaelChirico's number 11 and, somewhat, number 16 on #1523 by providing a trunc.cols argument (with corresponding default option("datatable.print.trunc.cols")). For example:

options("width" = 40)
DT <- data.table(thing_11 = vector("integer", 3),
                 thing_21 = vector("complex", 3),
                 thing_31 = as.IDate(paste0("2016-02-0", 1:3)),
                 thing_41 = "aasdfasdfasdfasdfasdfasdfasdfasdfasdfasdf",
                 thing_51 = vector("integer", 3),
                 thing_61 = vector("complex", 3))
print(DT, trunc.cols=TRUE)
#>    thing_11 thing_21   thing_31
#> 1:        0     0+0i 2016-02-01
#> 2:        0     0+0i 2016-02-02
#> 3:        0     0+0i 2016-02-03
#> Variables not shown: thing_41, thing_51, thing_61.

Some possible ideas for adjustments:

  • Include variable type in the "Variables shown..." message
  • Instead of going in the order of the columns, we could go with the data.table usual of showing the first and last columns.

@codecov
Copy link

codecov bot commented Nov 23, 2019

Codecov Report

Merging #4074 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #4074      +/-   ##
=========================================
+ Coverage    99.4%   99.4%   +<.01%     
=========================================
  Files          72      72              
  Lines       13675   13710      +35     
=========================================
+ Hits        13594   13629      +35     
  Misses         81      81
Impacted Files Coverage Δ
R/print.data.table.R 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f17d18...020d56f. Read the comment docs.

Copy link
Member

@jangorecki jangorecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about some unit tests?

man/print.data.table.Rd Outdated Show resolved Hide resolved
@TysonStanley
Copy link
Member Author

Added unit tests, fixed an issue that appeared with one row data tables when trunc.cols = TRUE, and updated the example with the suggestions from @jangorecki.

Should I add info to NEWS.md?

Also, regarding the classes, I was thinking we could add the classes to the "Variables not shown" message when class = TRUE so it shows up in the table and the message. Would it be worth going forward with this?

Copy link
Member

@jangorecki jangorecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelChirico could you take a look? you are more into printing options features.

@TysonStanley
Copy link
Member Author

The failed test looks like it just needs to be re-run (it looks like a LaTeX error).

I added the classes to the message about not printed columns when class=TRUE, added a message to NEWS about the changes, and updated the tests. @MichaelChirico do you have anything else you’d like me to do for this PR?

@MichaelChirico
Copy link
Member

@TysonStanley please see the inline comments I added, still a few things to handle. Thanks!

@TysonStanley
Copy link
Member Author

@MichaelChirico I’m not seeing the inline comments showing up. Am I missing it? Also, I like the updated NEWS.

@MichaelChirico
Copy link
Member

@TysonStanley do you see this and several others on the Files Changed tab?

https://github.com/Rdatatable/data.table/pull/4074/files

Screenshot 2019-11-28 at 7 50 10 AM

@TysonStanley
Copy link
Member Author

@MichaelChirico no, I looked there and couldn’t see any... All I see are the changes to the files (no comments at all).

R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
R/print.data.table.R Show resolved Hide resolved
R/print.data.table.R Outdated Show resolved Hide resolved
names = sapply(names, nchar_width)
dt_widths = ifelse(widths > names, widths, names)
rownum_width = if (row.names) nchar_width(nrow(x)) else 0
cumsum(dt_widths + 1) + rownum_width + 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will print columns 1, 2, ..., n where n is the last column before crossing options('width') right?

pandas is doing a bracketing approach (columns (1, N), (2, N-1), ..., (n, N-n)) where the (n+1, N-n-1) pair would break the options('width') threshold.

I'm not married to one way or the other, just worth considering which we think is the best fit. Are users likely to prefer seeing the first few columns, or seeing the first + last few? I'm not sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct. My personal preference is the way I've done it but that, by no means, is a good reason. What do you suggest? Would having another option be the fix here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should pick one and stick with it rather than add more options. We can play around with it in dev for a while & maybe change later.

Just some more thoughts: I guess most common is to have the first few columns with keys/"ID"-style rows. Then have "data" in the "rightmost" columns. So the tradeoff is -- showing all the key columns, vs showing some keys, some data. Still don't have a feel yet for which is more helpful to see in a preview.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I like the idea of differentiating between the keys and "data" in the preview. Definitely could get to some weird edge cases with this, depending on the console width and the number of keys...

@MichaelChirico
Copy link
Member

MichaelChirico commented Nov 28, 2019

@TysonStanley aaahhh sorry. I forgot to hit submit on my review. Hence the Pending in my screenshot. Should be visible now.

@TysonStanley
Copy link
Member Author

TysonStanley commented Nov 28, 2019

The "only" is in the wrong place in that commit message. It should say: "prints only the message when console too small to fit the first column". E.g.,

options("width" = 10)
DT = data.table(a = "aaaaaaaaaaaaa",
                b = "bbbbbbbbbbbbb",
                c = "ccccccccccccc",
                d = "ddddddddddddd")
print(DT, trunc.cols=TRUE)
#> 4 variables not shown: a <char>, b <char>, c <char>, d <char> 

@TysonStanley
Copy link
Member Author

TysonStanley commented Nov 28, 2019

This one adds the brackets to the tests. I like how those brackets look.

@TysonStanley
Copy link
Member Author

@MichaelChirico Alright, checks look good so I won't make any other changes unless you ask me to.

@MichaelChirico
Copy link
Member

Looks good to me! Thanks for your diligence!!

@mattdowle mattdowle added this to the 1.12.9 milestone Dec 18, 2019
@mattdowle
Copy link
Member

Excellent. @TysonStanley I've invited you to be project member. You'll need to accept the invitation displayed in your GitHub profile or GitHub repositories page. Then you can create branches directly in the main project. Welcome!
I'll merge as-is. Then I'll move the news item up and bump the test numbers in a follow-up commit. (Quicker for me that way than bringing the fork up to date.)

@mattdowle mattdowle merged commit 833702a into Rdatatable:master Dec 18, 2019
mattdowle added a commit that referenced this pull request Dec 18, 2019
@jangorecki jangorecki modified the milestones: 1.12.11, 1.12.9 May 26, 2020
@jangorecki
Copy link
Member

jangorecki commented Oct 17, 2020

@TysonStanley There should be better validation of argument values provided, for example following error message is very confusing.

> print(as.data.table(iris), trunc.cols=5L)
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
            <num>       <num>        <num>       <num>    <fctr>
  1:          5.1         3.5          1.4         0.2    setosa
  2:          4.9         3.0          1.4         0.2    setosa
  3:          4.7         3.2          1.3         0.2    setosa
  4:          4.6         3.1          1.5         0.2    setosa
  5:          5.0         3.6          1.4         0.2    setosa
 ---                                                            
146:          6.7         3.0          5.2         2.3 virginica
147:          6.3         2.5          5.0         1.9 virginica
148:          6.5         3.0          5.2         2.0 virginica
149:          6.2         3.4          5.4         2.3 virginica
150:          5.9         3.0          5.1         1.8 virginica
Error in print.data.table(as.data.table(iris), trunc.cols = 5L) : 
  object 'not_printed' not found

@TysonStanley
Copy link
Member Author

Just made a PR for this #4766 that adds a quick check for a logical trunc.cols argument at the top of print.data.table().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Request] head.data.table print columns fit to screen width
4 participants