-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
a generic number formatter #142
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basic idea looks great. I gave you a bunch of style comments because I think this is a good idea, and I wanted to make the implementation as strong as possible. Thanks for all your work on this!
R/formatter.r
Outdated
#' suffix = "\u2030", | ||
#' accuracy = .1) | ||
#' per_mille(v) | ||
number_format <- function(accuracy = 1, scale = 1, prefix = "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make more sense to default big.mark
to ","
, and set decimal.mark = NULL
Then we can do:
if (is.null(decimal.mark)) {
decimal.mark = if (identical(big.mark, ",")) "." else ","
}
And then document big.mark
and decimal.mark
together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably because I'm coming from a different culture (I'm French and comma is used as a decimal separator), I'm personally not in favour of using a comma as the default thousands separator, for the following reasons:
- using comma as the thousands separator is the purpose of
comma_format
specific formatter; - in an international perspective, it would be more relevant to use a space as a thousands separator, as used for example by Lancet journal (a space for thousands separator and a dot for decimal separator being a good compromise for a good understanding for an international audience);
- since 2003, the use of spaces as separators (for example: 20 000 and 1 000 000 for "twenty thousand" and "one million") has been officially endorsed by SI/ISO 31-0 standard, as well as by the International Bureau of Weights and Measures and the International Union of Pure and Applied Chemistry (IUPAC), the American Medical Association's widely followed AMA Manual of Style, and the Metrication Board, among others. (cf. https://www.wikiwand.com/en/Decimal_separator#/Digit_grouping)
If a space is used for thousands separator, I would suggest to simply state "." as the default value for decimal.mark
, rather than NULL
as it avoids to have to consider an implicit rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand those reasons, but it would be unfortunately inconsistent with every other number formatting function in the tidyverse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I have changed the default behaviour,.
By default, big.mark
is NULL. If NULL, a comma will be used except if decimal.mark
is already a comma and in such case a space will be used for thousands separators.
R/formatter.r
Outdated
#' my_format(v) | ||
#' | ||
#' # Per mille | ||
#' per_mille <- number_format(scale = 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use tidyverse style guide here? http://style.tidyverse.org/syntax.html#long-lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I forgot that styler
do not check documentation examples.
It is corrected.
R/formatter.r
Outdated
@@ -1,3 +1,72 @@ | |||
#' Number formatter: a generic formatter for numbers | |||
#' | |||
#' @return \code{number_format} returns a function with single parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use markdown formatting, i.e. `number_format`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
R/formatter.r
Outdated
#' decimal point | ||
#' @param trim logical, if \code{FALSE}, values are right-justified to a common | ||
#' width (see \code{\link[base]{format}}) | ||
#' @param ... other arguments passed on to \code{\link[base]{format}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Markdown link syntax is [base::format()]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
R/formatter.r
Outdated
ret <- paste0( | ||
prefix, | ||
format( | ||
scale * x, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be x / scale
? I don't think I understand the name scale
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, in the first version of this generic formatter, I used the word multiplier
, feeling it was more explicit.
However, when checking the other existing formatters available in scales
, it appeared that unit_format
already used such argument, called scale
and returning x * scale
.
Therefore, I changed the name of that argument of number_format
to scale
to be consistent with unit_format
.
We can decide to use another wording. However, my feeling is that we should try to have consistent names and behaviours between all formatters, as much as possible.
However, I do not know if you want to maintain backward compatibility with previous version of scales
or if you prefer to reorganize all formatters, even if it's introducing some breaking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, leave as scale
then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe emphasise that x
is multiplied by scale
in the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected
R/formatter.r
Outdated
prefix, | ||
format( | ||
scale * x, | ||
big.mark = big.mark, decimal.mark = decimal.mark, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put each named argument on a separate line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
@@ -8,6 +8,33 @@ test_that("time_format formats hms objects", { | |||
expect_equal(time_format(format = "%H")(hms::as.hms(a_time, tz = "UTC")), "11") | |||
}) | |||
|
|||
|
|||
test_that("number format works correctly", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please break up in to smaller tests? Like one for accuracy, one for decimal vs big mark, one for scale, one for prefix/suffix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests have been simplified and broken into smaller tests
R/formatter.r
Outdated
suffix = "", big.mark = " ", decimal.mark = ".", | ||
trim = TRUE, ...) { | ||
function(x) number( | ||
x, accuracy, scale, prefix, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please name arguments here (apart from x
). That makes it safer if we later change the arguments to number()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
R/formatter.r
Outdated
suffix = "", big.mark = " ", decimal.mark = ".", | ||
trim = TRUE, ...) { | ||
if (length(x) == 0) return(character()) | ||
if (is.null(accuracy)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make the code simpler to do
accuracy <- accuracy %||% precision(x)
x <- round_any(x, accuracy / scale)
nsmall <- -floor(log10(accuracy))
nsmall <- min(max(nsmall, 0), 20)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, clearly
R/formatter.r
Outdated
nsmall <- min(max(nsmall, 0), 20) | ||
ret <- paste0( | ||
prefix, | ||
format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might make the code a bit simpler to pull this out into a variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I'm not sure to understand your comment.
Do you mean something like:
ret <- format(
scale * x,
big.mark = big.mark,
decimal.mark = decimal.mark,
trim = trim,
nsmall = nsmall,
scientific = FALSE,
...
)
ret <- paste0(prefix, ret, suffix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good - thanks. Just a few more minor suggestions.
@@ -1,3 +1,84 @@ | |||
#' Number formatter: a generic formatter for numbers | |||
#' | |||
#' @return `number_format` returns a function with single parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please ident subsequent lines with two spaces? http://style.tidyverse.org/documentation.html#indents-and-line-breaks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected
R/formatter.r
Outdated
#' | ||
#' @return `number_format` returns a function with single parameter | ||
#' `x`, a numeric vector, that returns a character vector | ||
#' @param x a numeric vector to format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please make parameter descriptions into sentences? i.e. start with a capital letter and end with a full stop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. I will update the PR tonight or tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
R/formatter.r
Outdated
nsmall <- -floor(log10(accuracy)) | ||
nsmall <- min(max(nsmall, 0), 20) | ||
|
||
if (is.null(big.mark)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm, now this logic seems like it's the wrong way around. Maybe it's decimal.mark
that should be NULL
?
This is making me feel that you're correct that the default big.mark should be " "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not so easy to find an appropriate behaviour if we set decimal.mark
to NULL
.
Your previous proposal was:
if (is.null(decimal.mark)) {
decimal.mark = if (identical(big.mark, ",")) "." else ","
}
But this behaviour is probably not the best. If I need to present results in an international format, I have to specify big.mark = " "
but I want to keep a dot as a decimal separator.
If I want to present results in French, the first thing I want to set is the decimal.mark
as a comma, and the change of big.mark
is a consequence of this previous change.
Therefore, there is no general rule to derive decimal.mark
from big.mark
.
It is true that using a space as the default thousands separator is more generic as it is working with different decimal marks.
In addition, for those in need to present results in an American style, there is still the comma_format
that will be simply a short cut for number_format
with a comma as the default decimal mark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, lets go back to your original proposal with big.mark = " "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
And can you please add a bullet to NEWS? It should briefly describe the change and end with |
I applied all requested corrections and I squashed all PR commits into one |
Thanks for all your hard work on this! |
You are welcome. Let me know how you want to deal now with the existing formatters and if you want to base them on the generic formatter. It could be an opportunity to harmonise arguments between the different formatters.
Is
Let me know if you want to reorganise these formatters using If yes, would you prefer separate PR or only one PR? Would you prefer to keep a separate documentation page or to present the different alias on the same page? |
Maybe start with PR that uses I'd prefer to have them all documented together. |
cf. discussion in #77