Various enhancements to print.data.table #1523
Comments
|
Just brilliant! |
|
No idea about 3 and 5 (as to what they mean). |
|
It'd be really nice if GitHub would allow assigning tasks to people who aren't necessarily members of the project :-(. |
|
There's also #1497 |
|
@arunsrinivasan should I try and PR this one issue at a time? Or in a fell swoop? I've got 8 basically taken care of, just need to add tests. |
|
Michael, separate PRs. |
|
Very nice! Sorry to get back to you late on this, but Arun provided a nice example. It is just a nice convenience when interactively looking at tables with lots of columns, so your console isn't engulfed by a huge data dump when you take a look at the head. I'll close that other one. |
#1523 progress: adds option for dplyr-inspired column class summary with printing
|
It'd be also nice to print: primary key: by default. It's definitely informative to know what the keys and secondary indices are.. |
|
Also, I think this is better output for: print(DT, class=TRUE)
<char> <int> <num>
site date x
1: A 1 10
2: A 2 20
3: A 3 30
4: B 1 10
5: B 2 20
6: B 3 30
It's easier to copy/paste the data.table without the classes in the way. If we can do that, we can turn on printing classes by default. Thoughts? |
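For anyone who wants to play with the idea, here is a minimal base R sketch of how such a class row could be built. It is not how print.data.table does it; the helper name `class_header` and the abbreviation table are made up for illustration.

```r
# Sketch only: abbreviate each column's first class into a header row.
# `class_header` and the abbreviation table below are hypothetical.
class_header <- function(df) {
  abbrev <- c(character = "<char>", integer = "<int>", numeric = "<num>",
              logical = "<lgcl>", factor = "<fctr>")
  cls <- vapply(df, function(col) class(col)[1L], character(1L))
  out <- unname(abbrev[cls])
  # Fall back to the full class name for anything not in the table
  out[is.na(out)] <- paste0("<", cls[is.na(out)], ">")
  names(out) <- names(df)
  out
}

DF <- data.frame(site = c("A", "A", "B"), date = 1:3, x = c(10, 20, 30),
                 stringsAsFactors = FALSE)
hdr <- class_header(DF)
print(hdr)
```

The result is a named character vector, one abbreviated class per column, which a print method could place above or below the column names.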
|
@arunsrinivasan about printing keys:
About This can be done, but will require a step of wrangling -- basically |
|
About On printing keys:
primary key: <a, b> clearly tells the first key column is "a", then "b".. Does this clarify things a bit? |
|
I agree |
|
@arunsrinivasan OK, I think I can get on board with that. Can ditch point # 7 then. I agree distinguishing key order at a glance was going to be tough. So how about:
Lastly, I propose sending this output through |
|
My suggestion would be this:
Keys: <col1, col2> (only one) I don't mind "<>" being replaced with "" if that'd be more aesthetically pleasing.. e.g., "col1,col2", "col1" etc.. Last proposal: seems nice, but I wonder if it might create issues with knitr when people suppress 'messages' in chunk.. and print the output? |
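A rough sketch of the header format suggested here, in base R. The helper `format_key_line` is hypothetical; with a keyed data.table the input vector would presumably come from `key(DT)`. The helper only builds the string and leaves open how it gets emitted, since that is the part that might interact with knitr.

```r
# Hypothetical helper: format key columns, in key order, as "Keys: <col1, col2>".
format_key_line <- function(key_cols) {
  # No key set: return nothing so the caller can skip the line entirely
  if (length(key_cols) == 0L) return(character(0L))
  paste0("Keys: <", paste(key_cols, collapse = ", "), ">")
}

two_keys <- format_key_line(c("col1", "col2"))
one_key  <- format_key_line("col1")
no_key   <- format_key_line(NULL)
```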
|
It'd be great to have this and class=TRUE default for v1.9.8 already.. we'll see. |
|
One other thought: Many people use "numeric" type when an integer type would suffice, and when "integer64" would fit the bill better. How about marking those columns somehow while printing? Instead of <num>, perhaps >num< ?? That'll allow people to be aware of such optimisations as well.. |
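To make the idea concrete, a rough base R check for "numeric column that could be stored as integer" might look like the sketch below. The name `could_be_integer` is made up, and classed doubles (Date, POSIXct and the like) are deliberately skipped; this is an assumption about what one would want, not an existing data.table feature.

```r
# Sketch: flag plain double columns whose non-missing values are all whole
# numbers within R's 32-bit integer range. `could_be_integer` is hypothetical.
could_be_integer <- function(x) {
  # Skip non-doubles and classed doubles such as Date or POSIXct
  if (!is.double(x) || is.object(x)) return(FALSE)
  v <- x[!is.na(x)]
  all(is.finite(v)) && all(v == trunc(v)) && all(abs(v) <= .Machine$integer.max)
}

DF <- data.frame(a = c(1, 2, 3), b = c(1.5, 2, 3), c = 1:3)
flags <- vapply(DF, could_be_integer, logical(1L))
print(flags)
```

Note the check scans every value, so running it on each print of a large table would not be free.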
|
OR "!num!"? There's a function |
|
@arunsrinivasan Hmm I think it's definitely not something to be used as a part of Some initial musings:
Are you thinking of pushing 1.9.8 soon? Oh, one more thing, what do you think about porting |
|
Hm, yes, let's forget the marking of columns for now. On pushing 1.9.8: trying as much as possible to wrap up the other marked issues as quickly as possible. I'd like to work on non-equi joins for this release. On print.data.table to separate file, sure, sounds good. |
|
@arunsrinivasan just a heads up that setting |
|
Okay thanks, will take a look. |
|
Thought I would drop a mention of #2893 here as the two seem closely related. |
|
(Similar to my last comment...) Having a data.table like...
I cannot really read the contents of my list column, even though there is a print method for it. It would be nice to have a way to tell data.table how I want a list column of a certain class printed, like ...
Could have that list passed by the user in |
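A rough sketch of what that could look like, in base R only. Everything here is hypothetical (the function name `format_list_col`, the `formatters` argument, the `interval` class and the fallback tag); it only illustrates dispatching on the class of each cell.

```r
# Hypothetical sketch: format each cell of a list column using a
# user-supplied formatter chosen by the cell's class.
format_list_col <- function(col, formatters = list()) {
  vapply(col, function(cell) {
    # Try the cell's classes in order and use the first formatter found
    for (cl in class(cell)) {
      f <- formatters[[cl]]
      if (is.function(f)) return(as.character(f(cell))[1L])
    }
    # No formatter supplied: fall back to a short class/length tag
    paste0("<", class(cell)[1L], "[", length(cell), "]>")
  }, character(1L))
}

cells <- list(structure(list(lo = 1, hi = 5), class = "interval"), 1:3)
fmt <- list(interval = function(z) sprintf("[%g, %g]", z$lo, z$hi))
res <- format_list_col(cells, fmt)
print(res)
```

Cells with a registered class get the custom string; everything else keeps a compact placeholder.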
|
(If I want to suggest an addition to this list, do I add it here or add it as a discrete issue?) |
|
you can just add it here. feel free to edit initial post but also include a
comment w some exposition please
|
|
The fewer points defined in scope, the easier it is to merge a PR for it. It definitely makes sense to separate points which may result in a breaking change (if any) from those for which default behaviour will not change. |
|
this won't be done in a single PR though, but rather one by one
|
|
^ related: #2842 |
|
That would be awesome! |
|
hi all I don't know if you care but I noticed a bug in
output on my system is:
You can see in the output above that the column names V22 through V30 are printed, but I expected they should not be. What I expected:
|
Current task list:
… .Rd file for print.data.table
3. Ability to turn off smart table wrapping [2) from #645/R-F#1957 - Yike Lu]
… by-groupings [4) from #645/R-F#1957 - Yike Lu]
7. Demarcation of key columns [part of 5) from #645/R-F#1957 - Yike Lu]
… dplyr-like printing [see below - @MichaelChirico]
… dplyr tbl_df [#1497 - @nverno; #2608 - @vlulla]
… data.table [#545/R-F#5253 - @arunsrinivasan]
… list/non-atomic columns [see below - @franknarf1 via SO; also #605; handled in #2562]
POSIXct columns with timezones should include that information in printed output [#2842 - @MichaelChirico]
… print.data.table would exceed max.print)
Some Notes
3 (tabled pending clarification)
As I understand it, this issue is a request to prevent the console output from wrapping around (i.e., to force all columns to appear parallel, regardless of how wide the table is).
If that's the case, this is (AFAICT) impossible, since that's something done by RStudio/R itself. I for one certainly don't know of any way to alter this behavior.
If someone does know of a way to affect this, or if they think I'm mis-interpreting, please pipe up and we can have this taken care of.
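One base R setting that may be relevant: the `width` option controls how many characters print methods use before wrapping columns onto a new block. The sketch below uses a plain data.frame and only shows that the wrapping point moves with `width`; whether that covers the original request, and how RStudio's console interacts with it, is an assumption on my part.

```r
# Sketch: the same data.frame printed at two console widths.
DF <- as.data.frame(matrix(1:40, nrow = 2))   # 20 narrow columns, V1..V20

old <- options(width = 200)                   # wide enough for one block
wide <- capture.output(print(DF))

options(width = 40)                           # narrow: columns wrap into blocks
narrow <- capture.output(print(DF))

options(old)                                  # restore the previous setting

length(wide)    # header plus two rows
length(narrow)  # more lines, because the columns were split into blocks
```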
7
As I see it there are two options here. One is to treat all key columns the same; the other is to treat secondary, tertiary, etc. keys separately.
Example output:
And of course, add an option for deciding whether to demarcate with | or some other user's-choice character (*, +, etc.)
9 [DONE]
Some feedback from a closed PR that was a first stab at solving this:
From Arun regarding preferred options:
10 [DONE]
It would be nice to have an option to print a row under the row of column names which gives each column's stored type, as is currently (I understand) the default for the output of dplyr operations.
Example from dplyr:
Current best alternative is to do sapply(DF, class), but it's nice to have a preview of the data with this extra information.
11
This seems closely related to 3. Current plan is to implement this as an alternative to 3 since it seems more tangible/doable.
Via @nverno:
and the guiding example from Arun:
12
Currently covered by @jangorecki's PR #1448; Jan, assuming #1529 is merged first, could you edit the print.data.table man page for your PR?