Don't print decimals if no value in vector has decimals. #62
When importing data from other software packages into R (e.g. from Stata, SAS or SPSS, using haven), vectors are of type double, even if they contain only integer values.
Would you mind checking whether a vector has actual "floating point" values or only "integer-doubles", and then omitting the decimals in the latter case? (something like
library(tibble)
tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <int>
#> 1  1.00     1
#> 2  2.00     2
#> 3  3.00     3
Since all values in
Another example: When I print the vector alone, the desired output is shown:
library(tibble)
x <- tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L), c = c(1.23, 1.34, 1.45))
x$a
#> [1] 1 2 3
x$b
#> [1] 1 2 3
x$c
#> [1] 1.23 1.34 1.45
Same behaviour would be nice for tibbles as well - I hope you think this makes sense.
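As an illustration of the kind of check being asked for, here is a minimal sketch in base R. The helper name `is_whole_double()` is hypothetical and is not part of tibble, pillar, or haven; it only shows one way to tell "integer-doubles" apart from doubles that carry fractions, ignoring missing values.

```r
# Hypothetical helper (not a tibble/pillar function): TRUE when x is a
# double vector and every non-missing value has no fractional part.
is_whole_double <- function(x) {
  is.double(x) && all(x == trunc(x), na.rm = TRUE)
}

is_whole_double(c(1, 2, 3))          # doubles holding whole numbers
is_whole_double(c(1.23, 1.34, 1.45)) # doubles with fractions
is_whole_double(c(1L, 2L, 3L))       # integer type, so not a double at all
```

Note that this scans the whole vector, which is exactly the cost the replies in this thread are concerned about.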
I've thought about that, too. What's the chance that a column contains fractions if the first 10 entries don't have any? If we assume that only
We can really fix this only if the column contains some metadata that describes all values.
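To make the concern concrete, the following sketch shows how a check limited to the printed rows can disagree with a check over the whole column. The helper `has_fraction()` and the example data are assumptions made up for illustration; this is not how pillar decides on its formatting.

```r
# Hypothetical helper: does any non-missing value have a fractional part?
has_fraction <- function(x) any(x != trunc(x), na.rm = TRUE)

# 1000 whole-valued doubles with a single fractional value far down.
x <- as.numeric(1:1000)
x[500] <- 0.5

has_fraction(head(x, 10))  # FALSE: the first 10 rows look like integers
has_fraction(x)            # TRUE: the column as a whole is not whole-valued
```

A display based only on the first ten entries would drop the decimals here, even though the column contains a fraction.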
Maybe make this an option? Printing only the dot but not the trailing zeros doesn't look appealing to me:
pillar::pillar(as.numeric(1:3))
#> <dbl>
#> 1.
#> 2.
#> 3.
I'd agree with @hadley, that there are many cases where the first 10 entries don't include any digits to the right of the decimal, while somewhere in the data they do, but I'm not sure that's more common than the other way around. You may be trying to avoid a common but minority misrepresentation by using a method that misrepresents the data the majority of the time.
I understand the performance benefit for only checking the rows that are printed. If that's the way pillar displays data (check the rows you print), why not have that be the data you're representing (the rows you print)?
Trailing zeros have a meaning. They mean that somewhere in this data there is an entry with values to the right of the decimal. If printing using pillar is supposed to give you information about ALL the data (instead of just the data it prints) while only checking a portion, you're either going to have to find some magic, cache the checks of the entire data when an object is created (which means changing other packages), or decide between two cases where the wrong meaning is displayed (as a trade-off for the performance). In one case the display tells you there are later, unprinted entries with values to the right of the decimal (when there aren't); in the other case the display tells you there are no later, unprinted entries with values to the right of the decimal (when there are). I'm not sure it's clear that the second option (the new way tibbles print) is better than the first.
I'd also add that this behavior for data with no values to the right of the decimal is not the way printing that same data was handled by tibbles previously. So, though different things surprise different people, this will be surprising (at least for a while) for most users of the tidyverse.
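The "cache the check when the object is created" idea mentioned in this thread could look roughly like the sketch below. Everything here is hypothetical: the attribute name `all_whole`, the constructor `with_whole_flag()`, and the formatter `format_column()` are invented for illustration and do not exist in tibble, pillar, or haven.

```r
# Hypothetical constructor: scan the full vector once and store the result
# as metadata, so printing never has to look beyond the rows it shows.
with_whole_flag <- function(x) {
  stopifnot(is.double(x))
  attr(x, "all_whole") <- all(x == trunc(x), na.rm = TRUE)
  x
}

# Hypothetical formatter: formats only the first n values, but chooses the
# number of decimals from the cached flag that describes ALL values.
format_column <- function(x, n = 10) {
  shown <- as.vector(head(x, n))
  if (isTRUE(attr(x, "all_whole"))) {
    format(shown, nsmall = 0)
  } else {
    format(shown, nsmall = 2)
  }
}

format_column(with_whole_flag(c(1, 2, 3)))
format_column(with_whole_flag(c(1, 2, 3, 0.5)), n = 3)
```

The obvious weakness of this approach is that the flag goes stale or is lost whenever the vector is modified or subset, so whichever package creates the column would have to maintain it.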