Don't print decimals if no value in vector has decimals. #62

Closed
opened this Issue Nov 3, 2017 · 7 comments


strengejacke commented Nov 3, 2017 • edited

 When importing data from other software packages into R (e.g. from Stata, SAS or SPSS, using haven), vectors are of type double, even if they hold only integer values. Would you mind checking whether a vector has "floating point" values or actually consists of "integer-doubles", and omitting the decimals in the latter case? (Something like `is.numeric(x) && !all(x %% 1 == 0, na.rm = TRUE)` to detect real decimals.)

 Current output:

```
library(tibble)
tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <int>
#> 1  1.00     1
#> 2  2.00     2
#> 3  3.00     3
```

 Since all values in `a` are "integers", the desired output would look like column `b`. The problem is that this is a guess: the column may be a genuine double, or it may have been intended as an integer. But I can imagine (new) R users being confused when they see their values as "integers" in the SPSS data sheet and as doubles in the R console.
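 The check suggested above can be sketched as a small helper. This is only an illustration: `is_whole_double()` is a hypothetical name, not a function in tibble or pillar, and it tests the opposite condition to the snippet above (it is `TRUE` when the decimals could be omitted).

```r
# Hypothetical helper (not part of tibble or pillar): TRUE if `x` is a double
# vector whose non-missing values are all whole numbers.
is_whole_double <- function(x) {
  is.double(x) && all(x %% 1 == 0, na.rm = TRUE)
}

is_whole_double(c(1, 2, 3))      # doubles holding whole numbers
#> [1] TRUE
is_whole_double(c(1L, 2L, 3L))   # integer type, so not a double at all
#> [1] FALSE
is_whole_double(c(1.23, 2, NA))  # contains a fractional value
#> [1] FALSE
```

 Note that this sketch scans the whole vector, which is exactly the cost discussed further down in the thread.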

strengejacke commented Nov 3, 2017

 Another example: when I print the vectors alone, the desired output is shown:

```
library(tibble)
x <- tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L), c = c(1.23, 1.34, 1.45))
x$a
#> [1] 1 2 3
x$b
#> [1] 1 2 3
x$c
#> [1] 1.23 1.34 1.45
```

 The same behaviour would be nice for tibbles as well. I hope you think this makes sense.
Member

krlmlr commented Jan 10, 2018

 @hadley: Should we omit the dot and the zeros after the dot if we only see whole numbers?

 Current output:

```
pillar::pillar(as.numeric(1:3))
#> <dbl>
#>  1.00
#>  2.00
#>  3.00
```

 Created on 2018-01-10 by the reprex package (v0.1.1.9000)
Member

 Hmmm, I don't think it's a good idea to do this. For performance reasons, we can only inspect the rows being printed, so this seems potentially misleading to me.
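 A minimal sketch of the concern, using a made-up column and assuming that only the first 10 rows are printed and inspected:

```r
# Hypothetical column: ten whole numbers followed by one fractional value.
x <- c(1:10, 10.5)

# Looking only at the rows that would be printed (the first 10, as assumed here)
shown_whole <- all(head(x, 10) %% 1 == 0)
shown_whole
#> [1] TRUE

# Looking at the whole column
all_whole <- all(x %% 1 == 0)
all_whole
#> [1] FALSE
```

 Dropping the decimals based on the first check would suggest a column of whole numbers, even though the eleventh value is fractional.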
Member

krlmlr commented Jan 11, 2018

 I've thought about that, too. What's the chance that a column contains fractions if the first 10 entries don't have any? If we assume that only `.0` and `.5` are present, and a uniform distribution, that's < 0.1%. If we assume `.0` through `.9`, that's 10⁻¹⁰. We have the type indicator, too. The `digits.secs` option also triggers fractional seconds only for the displayed data.

 We can really fix this only if the column contains some metadata that describes all values. Maybe make this an option?

 Printing only the dot but not the trailing zeros doesn't look appealing to me:

```
pillar::pillar(as.numeric(1:3))
#> <dbl>
#>    1.
#>    2.
#>    3.
```
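 The two estimates above can be reproduced with a line of arithmetic each. This sketch assumes, as the comment does, that fractional parts are uniformly distributed, and additionally that rows are independent of each other, which real data need not satisfy.

```r
n_shown <- 10  # number of rows printed, as assumed in the comment above

# Only .0 and .5 occur, equally likely: chance that all shown values are whole
p_two_values <- 0.5^n_shown
p_two_values   # roughly 0.001, i.e. just under 0.1%

# Ten equally likely fractional parts, .0 through .9
p_ten_values <- 0.1^n_shown
p_ten_values   # 10^-10
```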
Member

 Based on readr experience, quite high. I'd rather not add more options.

Closed

mentalplex commented Jan 27, 2018 • edited

 I'd agree with @hadley, that there are many cases where the first 10 entries don't include any digits to the right of the decimal, while somewhere in the data they do, but I'm not sure that's more common than the other way around. You may be trying to avoid a common but minority misrepresentation by using a method that misrepresents the data the majority of the time.

 I understand the performance benefit for only checking the rows that are printed. If that's the way pillar displays data (check the rows you print), why not have that be the data you're representing (the rows you print)? Trailing zeros have a meaning. They mean somewhere in this data there is an entry with values to the right of the decimal.

 If printing using pillar is supposed to give you information about ALL the data (instead of just the data it prints) while only checking a portion, you're either going to have to find some magic, cache the checks of the entire data when an object is created (change other packages), or decide between two cases where the wrong meaning is displayed (as a trade-off for the performance). In one case the display tells you there are later, unprinted entries with values to the right of the decimal (when there aren't); in the other case the display tells you there are no later unprinted entries with values to the right of the decimal (when there are). I'm not sure it's clear that the second option (the new way tibbles print) is better than the first.

 I'd also add that this behavior for data with no values to the right of the decimal is not the way printing that same data was handled by tibbles previously. So, though different things surprise different people, this will be surprising (at least for a while) for most users of the tidyverse.
Member

krlmlr commented Feb 7, 2018

 Closing in favor of #40: Adding a trailing dot but without decimals in these cases.

krlmlr added a commit that referenced this issue Feb 26, 2018

``` Merge tag 'v1.2.0' ```
```Display
-------

- Turned off using subtle style for digits that are considered insignificant.  Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not displayed anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.```
``` 6fbdc2f ```

krlmlr added a commit that referenced this issue Feb 27, 2018

``` Merge tag 'v1.2.1' ```
```Display
-------

- Turned off using subtle style for digits that are considered insignificant.  Negative numbers are shown all red.  Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not shown anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.```
``` 2065df9 ```