Don't print decimals if no value in vector has decimals. #62

Closed
strengejacke opened this Issue Nov 3, 2017 · 7 comments

strengejacke commented Nov 3, 2017

When importing data from other software packages into R (e.g. from Stata, SAS or SPSS, using haven), vectors are of type double, even if they contain only whole numbers.

Would you mind checking whether a vector has "floating point" values or is actually made of "integer-doubles", and then omitting the decimals? (something like is.numeric(x) && !all(x %% 1 == 0, na.rm = TRUE))

Current output:

library(tibble)
tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <int>
#> 1  1.00     1
#> 2  2.00     2
#> 3  3.00     3

Since all values in a are whole numbers, the desired output would look like column b. The problem is that this is a guess: we can't know whether the column is really meant as a double or was intended as an integer. But I can imagine (new) R users being confused when they see their values as "integers" in the SPSS data sheet and as doubles in the R console.
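The check suggested above can be sketched in base R as follows. The helper name `is_whole_double()` is made up for illustration and is not part of tibble or pillar; it is the inverse of the expression in the request, returning `TRUE` when no fractional part is present.

```r
# Hypothetical helper (not a tibble/pillar function): TRUE if x is a double
# vector whose non-missing values are all whole numbers.
is_whole_double <- function(x) {
  is.double(x) && all(x %% 1 == 0, na.rm = TRUE)
}

col_a <- c(1, 2, 3)          # doubles that hold whole numbers
col_b <- c(1L, 2L, 3L)       # true integers
col_c <- c(1.23, 1.34, 1.45) # doubles with fractional parts

is_whole_double(col_a) # TRUE
is_whole_double(col_b) # FALSE, because the vector is integer, not double
is_whole_double(col_c) # FALSE
```

Note that this inspects every element of the vector, which is the cost discussed later in the thread.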

strengejacke commented Nov 3, 2017

Another example: When I print the vector alone, the desired output is shown:

library(tibble)
x <- tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L), c = c(1.23, 1.34, 1.45))
x$a
#> [1] 1 2 3
x$b
#> [1] 1 2 3
x$c
#> [1] 1.23 1.34 1.45

The same behaviour would be nice for tibbles as well. I hope you think this makes sense.

Member

krlmlr commented Jan 10, 2018

@hadley: Should we omit the dot and the zeros after the dot if we only see whole numbers? Current output:

pillar::pillar(as.numeric(1:3))
#> <dbl>
#>  1.00
#>  2.00
#>  3.00

Created on 2018-01-10 by the reprex package (v0.1.1.9000)

Member

hadley commented Jan 10, 2018

Hmmm, I don't think it's a good idea to do this. For performance reasons, we can only inspect the rows being printed, so this seems potentially misleading to me.
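A small base R illustration of this concern, assuming (for the sake of the example) that only the first 10 rows are printed and inspected:

```r
# The first 10 entries are whole numbers; the 11th has a fractional part.
x <- c(1:10, 10.5)

# Assumption for this sketch: only the first 10 rows are displayed.
shown <- head(x, 10)

whole_in_shown <- all(shown %% 1 == 0) # TRUE: the printed rows look like integers
whole_overall  <- all(x %% 1 == 0)     # FALSE: the full column is not whole
```

A display rule based on `shown` alone would drop the decimals here even though the column contains a fraction further down.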

Member

krlmlr commented Jan 11, 2018

I've thought about that, too. What's the chance that a column contains fractions if the first 10 entries don't have any? If we assume that only .0 and .5 are present, and a uniform distribution, that's < 0.1%. If we assume .0 through .9, that's 10⁻¹⁰. We have the type indicator, too.
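The two figures above can be reproduced with a quick calculation. This sketch assumes, as the estimate implicitly does, that entries are independent and that each possible fractional part is equally likely; the variable names are only for illustration.

```r
n_shown <- 10 # number of displayed entries

# Only .0 and .5 occur, each with probability 1/2:
# chance that all displayed entries end in .0
p_two_fractions <- (1 / 2)^n_shown # 1/1024, just under 0.1%

# .0 through .9 occur, each with probability 1/10
p_ten_fractions <- (1 / 10)^n_shown # 10^-10
```

Real data is rarely uniform or independent across rows (for example, sorted or grouped values), so these numbers are a best case for the estimate.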

The digits.secs option also triggers fractional seconds only for the displayed data.

We can really fix this only if the column contains some metadata that describes all values.

Maybe make this an option? Printing only the dot but not the trailing zeros doesn't look appealing to me:

pillar::pillar(as.numeric(1:3))
#> <dbl>
#>    1.
#>    2.
#>    3.

Member

hadley commented Jan 11, 2018

Based on readr experience, quite high.

I'd rather not add more options.

mentalplex commented Jan 27, 2018

I'd agree with @hadley, that there are many cases where the first 10 entries don't include any digits to the right of the decimal, while somewhere in the data they do, but I'm not sure that's more common than the other way around. You may be trying to avoid a common but minority misrepresentation by using a method that misrepresents the data the majority of the time.

I understand the performance benefit for only checking the rows that are printed. If that's the way pillar displays data (check the rows you print), why not have that be the data you're representing (the rows you print)?

Trailing zeros have a meaning. They mean somewhere in this data there is an entry with values to the right of the decimal. If printing using pillar is supposed to give you information about ALL the data (instead of just the data it prints) while only checking a portion, you're either going to have to find some magic, cache the checks of the entire data when an object is created (change other packages), or decide between two cases where the wrong meaning is displayed (as a trade off for the performance). In one case the display tells you there are later, unprinted entries with values to the right of the decimal (when there aren't); in the other case the display tells you there are no later unprinted entries with values to the right of the decimal (when there are). I'm not sure it's clear that the second option (the new way tibbles print) is better than the first.

I'd also add that this behavior for data with no values to the right of the decimal is not the way printing that same data was handled by tibbles previously. So, though different things surprise different people, this will be surprising (at least for a while) for most users of the tidyverse.

Member

krlmlr commented Feb 7, 2018

Closing in favor of #40: Adding a trailing dot but without decimals in these cases.
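As a rough sketch of what "a trailing dot but without decimals" means, here is a base R approximation. `format_dbl_sketch()` is a hypothetical helper for illustration only; it is not how pillar implements its formatting, and it does not handle missing values or scientific notation.

```r
# Hypothetical helper, not pillar's implementation: if every value is a whole
# number, print it with a trailing dot and no zeros; otherwise use format().
format_dbl_sketch <- function(x) {
  stopifnot(is.double(x))
  if (all(x %% 1 == 0)) {
    paste0(format(x, scientific = FALSE), ".")
  } else {
    format(x, scientific = FALSE)
  }
}

whole_out <- format_dbl_sketch(c(1, 2, 3))          # "1." "2." "3."
mixed_out <- format_dbl_sketch(c(1.23, 1.34, 1.45)) # "1.23" "1.34" "1.45"
```

The trailing dot keeps a visual hint that the column is a double even when no decimals are shown.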

@krlmlr krlmlr closed this Feb 7, 2018

krlmlr added a commit that referenced this issue Feb 26, 2018

Merge tag 'v1.2.0'
Display
-------

- Turned off using subtle style for digits that are considered insignificant.  Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not displayed anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.

krlmlr added a commit that referenced this issue Feb 27, 2018

Merge tag 'v1.2.1'
Display
-------

- Turned off using subtle style for digits that are considered insignificant.  Negative numbers are shown all red.  Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not shown anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.