Fix label_number_si() to use SI prefixes #235

Merged
merged 30 commits into from Mar 26, 2021
Commits
ff9529b
Edit label_number_si() to use SI prefixes
davidchall Nov 20, 2019
255572b
Update test to send another argument to number()
davidchall Nov 20, 2019
3c491a6
Fix unicode error emitted by R CMD check
davidchall Nov 20, 2019
6060716
Remove non-ASCII character from docs
davidchall Nov 20, 2019
9b32153
Fix unicode mismatch on Windows
davidchall Nov 20, 2019
c3c4aed
Another attempt to resolve Windows unicode
davidchall Feb 20, 2020
eb5f06c
Share SI prefixes with label_bytes
davidchall Feb 20, 2020
073d476
Restore whitespace
davidchall Feb 20, 2020
ba1b2de
Add billion_scale argument to label_dollar()
davidchall Feb 20, 2020
919f3d6
Remove wikipedia hyperlink
davidchall Feb 20, 2020
5cb01e1
Merge branch 'master' into label-si
davidchall Aug 22, 2020
9f9993f
Merge branch 'master' into label-si
davidchall Mar 17, 2021
90a6b7b
Rename argument as rescale_large
davidchall Mar 17, 2021
09f6f45
Work with accuracy and scale arguments
davidchall Mar 17, 2021
3e7ec8e
Clarify short scale used internationally for finance
davidchall Mar 17, 2021
3c1403d
Refactor common code in rescale_by_suffix()
davidchall Mar 18, 2021
7c40c25
Set default accuracy to NULL
davidchall Mar 18, 2021
9a1c8ee
Fix conflicting factor levels on R 3.4
davidchall Mar 18, 2021
e22bcee
Rename short/long scale functions
davidchall Mar 24, 2021
92a5ba8
Move SI prefixes into SI file
davidchall Mar 24, 2021
2875c4d
label_bytes() uses rescale_by_suffix()
davidchall Mar 24, 2021
e83764b
label_number_si() supports scale argument
davidchall Mar 24, 2021
ff06709
Remove sep argument from label_number_si()
davidchall Mar 24, 2021
e10d289
First argument of label_number_si() is unit
davidchall Mar 24, 2021
e80ae91
Require unit argument
davidchall Mar 24, 2021
cffa7c7
Update NEWS
davidchall Mar 25, 2021
b414869
NEWS update
davidchall Mar 25, 2021
4a632a5
Document when `scale` argument is useful
davidchall Mar 25, 2021
b82c1cc
Remove headings from NEWS
davidchall Mar 25, 2021
39d48d8
Fix docs typo
davidchall Mar 25, 2021
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -37,4 +37,4 @@ Suggests:
 Encoding: UTF-8
 LazyLoad: yes
 Roxygen: list(markdown = TRUE, r6 = FALSE)
-RoxygenNote: 7.1.0
+RoxygenNote: 7.1.1
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -132,10 +132,12 @@ export(pvalue_format)
 export(reciprocal_trans)
 export(regular_minor_breaks)
 export(rescale)
+export(rescale_long_scale)
 export(rescale_max)
 export(rescale_mid)
 export(rescale_none)
 export(rescale_pal)
+export(rescale_short_scale)
 export(reverse_trans)
 export(scientific)
 export(scientific_format)
22 changes: 22 additions & 0 deletions NEWS.md
@@ -3,6 +3,28 @@
 * `manual_pal()` now always returns an unnamed colour vector, which is easy to
   use with `ggplot2::discrete_scale()` (@yutannihilation, #284).

+* `label_number_si()` now correctly uses [SI prefixes](https://en.wikipedia.org/wiki/Metric_prefix)
+  (e.g. abbreviations "k" for "kilo-" and "m" for "milli-"). It previously used
+  [short scale abbreviations](https://en.wikipedia.org/wiki/Long_and_short_scales)
+  (e.g. "M" for million, "B" for billion). The short scale is most commonly used
+  in finance, so it is now supported via the new `rescale_large` argument of
+  `label_dollar()` (@davidchall, #235).
+
+* `label_number_si()` now requires the `unit` argument to be specified. The default
+  value of the `accuracy` argument is now `NULL`, which automatically chooses
+  the precision. The `sep` argument, which had no purpose, is removed (@davidchall, #235).
+
+* `label_dollar()` gains a `rescale_large` argument to support scaling of large
+  numbers by suffix (e.g. "M" for million, "B" for billion). In finance, the
+  short scale is most prevalent (i.e. 1 billion = 1 thousand million). In other
+  contexts, the long scale might be desired (i.e. 1 billion = 1 million million).
+  These two common scales are supported by setting `rescale_large = rescale_short_scale()`
+  or `rescale_large = rescale_long_scale()`, but custom scaling-by-suffix is also
+  supported (@davidchall, #235).
+
+* `label_bytes()` now correctly accounts for the `scale` argument when choosing
+  auto units (@davidchall, #235).
+
 # scales 1.1.1

 * `breaks_width()` now handles `difftime`/`hms` objects (@bhogan-mitre, #244).
29 changes: 13 additions & 16 deletions R/label-bytes.R
@@ -1,4 +1,4 @@
-#' Label bytes (1 kb, 2 MB, etc)
+#' Label bytes (1 kB, 2 MB, etc)
 #'
 #' Scale bytes into human friendly units. Can use either SI units (e.g.
 #' kB = 1000 bytes) or binary units (e.g. kiB = 1024 bytes). See
@@ -10,7 +10,7 @@
 #' SI units (base 1000).
 #' * "kiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", and "YiB" for
 #' binary units (base 1024).
-#' * `auto_si` or `auto_binary` to automatically pick the most approrpiate
+#' * `auto_si` or `auto_binary` to automatically pick the most appropriate
 #' unit for each value.
 #' @inheritParams number_format
 #' @param ... Other arguments passed on to [number()]
@@ -37,7 +37,7 @@
 #' breaks = breaks_width(250 * 1024),
 #' label = label_bytes("auto_binary")
 #' )
-label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
+label_bytes <- function(units = "auto_si", accuracy = 1, scale = 1, ...) {
   stopifnot(is.character(units), length(units) == 1)
   force_all(accuracy, ...)

@@ -48,8 +48,10 @@ label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
       base <- switch(units, auto_binary = 1024, auto_si = 1000)
       suffix <- switch(units, auto_binary = "iB", auto_si = "B")

-      power <- findInterval(abs(x), c(0, base^powers)) - 1L
-      units <- paste0(c("", names(powers))[power + 1L], suffix)
+      rescale <- rescale_by_suffix(x * scale, breaks = c(0, base^powers))
+
+      suffix <- paste0(" ", rescale$suffix, suffix)
+      scale <- scale * rescale$scale
     } else {
       si_units <- paste0(names(powers), "B")
       bin_units <- paste0(names(powers), "iB")
@@ -63,22 +65,17 @@ label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
       } else {
         stop("'", units, "' is not a valid unit", call. = FALSE)
       }
+
+      suffix <- paste0(" ", units)
+      scale <- scale / base^power
     }

     number(
-      x / base^power,
+      x,
       accuracy = accuracy,
-      suffix = paste0(" ", units),
+      scale = scale,
+      suffix = suffix,
       ...
     )
   }
 }
-
-# Helpers -----------------------------------------------------------------
-
-si_powers <- (-8:8) * 3
-names(si_powers) <- c(
-  rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
-  "k", "M", "G", "T", "P", "E", "Z", "Y"
-)
-si_powers
56 changes: 52 additions & 4 deletions R/label-dollar.R
@@ -14,6 +14,11 @@
 #' value is less than `largest_with_cents` which by default is 100,000.
 #' @param prefix,suffix Symbols to display before and after value.
 #' @param negative_parens Display negative using parentheses?
+#' @param rescale_large Named list indicating suffixes given to large values
+#' (e.g. thousands, millions, billions, trillions). Name gives suffix, and
+#' value specifies the power-of-ten. The two most common scales are provided
+#' (`rescale_short_scale()` and `rescale_long_scale()`).
+#' If `NULL`, the default, these suffixes aren't used.
 #' @param ... Other arguments passed on to [base::format()].
 #' @export
 #' @family labels for continuous scales
@@ -23,7 +28,7 @@
 #'
 #' # Customise currency display with prefix and suffix
 #' demo_continuous(c(1, 100), labels = label_dollar(prefix = "USD "))
-#' euro <- dollar_format(
+#' euro <- label_dollar(
 #' prefix = "",
 #' suffix = "\u20ac",
 #' big.mark = ".",
@@ -33,10 +38,26 @@
 #'
 #' # Use negative_parens = TRUE for finance style display
 #' demo_continuous(c(-100, 100), labels = label_dollar(negative_parens = TRUE))
+#'
+#' # In finance the short scale is most prevalent
+#' dollar <- label_dollar(rescale_large = rescale_short_scale())
+#' demo_log10(c(1, 1e18), breaks = log_breaks(7, 1e3), labels = dollar)
+#'
+#' # In other contexts the long scale might be used
+#' long <- label_dollar(prefix = "", rescale_large = rescale_long_scale())
+#' demo_log10(c(1, 1e18), breaks = log_breaks(7, 1e3), labels = long)
+#'
+#' # You can also define a custom naming scheme
+#' gbp <- label_dollar(
+#' prefix = "\u00a3",
+#' rescale_large = c(k = 3L, m = 6L, bn = 9L, tn = 12L)
+#' )
+#' demo_log10(c(1, 1e12), breaks = log_breaks(5, 1e3), labels = gbp)
 label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
                          suffix = "", big.mark = ",", decimal.mark = ".",
                          trim = TRUE, largest_with_cents = 100000,
-                         negative_parens = FALSE, ...) {
+                         negative_parens = FALSE, rescale_large = NULL,
+                         ...) {
   force_all(
     accuracy,
     scale,
@@ -47,6 +68,7 @@ label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
     trim,
     largest_with_cents,
     negative_parens,
+    rescale_large,
     ...
   )
   function(x) dollar(
@@ -60,6 +82,7 @@ label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
     trim = trim,
     largest_with_cents = largest_with_cents,
     negative_parens,
+    rescale_large = rescale_large,
     ...
   )
 }
@@ -86,9 +109,10 @@ dollar_format <- label_dollar
 dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",
                    suffix = "", big.mark = ",", decimal.mark = ".",
                    trim = TRUE, largest_with_cents = 100000,
-                   negative_parens = FALSE, ...) {
+                   negative_parens = FALSE, rescale_large = NULL,
+                   ...) {
   if (length(x) == 0) return(character())
-  if (is.null(accuracy)) {
+  if (is.null(accuracy) && is.null(rescale_large)) {
     if (needs_cents(x * scale, largest_with_cents)) {
       accuracy <- .01
     } else {
@@ -102,6 +126,18 @@ dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",
   negative <- !is.na(x) & x < 0
   x <- abs(x)

+  if (!is.null(rescale_large)) {
+    if (!(is.integer(rescale_large) && all(rescale_large > 0))) {
+      stop("`rescale_large` must be positive integers.", call. = FALSE)
+    }
+
+    rescale <- rescale_by_suffix(x * scale, breaks = c(0, 10^rescale_large))
+
+    sep <- if (suffix == "") "" else " "
+    suffix <- paste0(rescale$suffix, sep, suffix)
+    scale <- scale * rescale$scale
+  }
+
   amount <- number(
     x,
     accuracy = accuracy,
@@ -126,3 +162,15 @@ dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",

   amount
 }
+
+#' @export
+#' @rdname label_dollar
+rescale_short_scale <- function() {
+  c(K = 3L, M = 6L, B = 9L, T = 12L)
+}
+
+#' @export
+#' @rdname label_dollar
+rescale_long_scale <- function() {
+  c(K = 3L, M = 6L, B = 12L, T = 18L)
+}
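
To make the difference between the two scales concrete, here is a minimal base-R sketch (not the package's implementation) that looks up the suffix a value would receive under each named vector. The helper `suffix_for()` is hypothetical and exists only for this illustration; the two vectors are copied from `rescale_short_scale()` and `rescale_long_scale()` in the diff above.

```r
# Named integer vectors copied from the two helpers added in this diff.
short_scale <- c(K = 3L, M = 6L, B = 9L, T = 12L)
long_scale <- c(K = 3L, M = 6L, B = 12L, T = 18L)

# Hypothetical helper (not part of scales): pick the suffix whose
# power-of-ten break is the largest one not exceeding abs(x).
suffix_for <- function(x, powers) {
  idx <- findInterval(abs(x), 10^powers)
  c("", names(powers))[idx + 1L]
}

suffix_for(5e9, short_scale) # "B": 5e9 is 5 billion on the short scale
suffix_for(5e9, long_scale)  # "M": 5e9 is 5000 million on the long scale

# dollar() checks is.integer(rescale_large), so a custom scheme needs
# integer literals (the L suffix) rather than plain doubles.
is.integer(c(k = 3, m = 6))   # FALSE
is.integer(c(k = 3L, m = 6L)) # TRUE
```

The last two lines are why the custom `gbp` example in the roxygen block writes `3L`, `6L`, and so on: a vector of doubles would hit the `stop()` in `dollar()`.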
61 changes: 35 additions & 26 deletions R/label-number-si.R
@@ -1,46 +1,55 @@
-#' Label numbers with SI prefixes (2k, 1M, 5T etc)
+#' Label numbers with SI prefixes (2 kg, 5 mm, etc)
 #'
-#' `number_si()` automatically scales and labels with the best SI prefix,
-#' "K" for values \eqn{\ge} 10e3, "M" for \eqn{\ge} 10e6,
-#' "B" for \eqn{\ge} 10e9, and "T" for \eqn{\ge} 10e12.
+#' `label_number_si()` automatically adds the most suitable SI prefix and scales
+#' the values appropriately. For example, values greater than 1000 gain a "k"
+#' prefix (abbreviated from "kilo-") and are scaled by 1/1000.
+#' See [Metric Prefix](https://en.wikipedia.org/wiki/Metric_prefix) on Wikipedia
+#' for more details.
 #'
 #' @inherit number_format return params
-#' @param unit Optional units specifier.
-#' @param sep Separator between number and SI unit. Defaults to `" "` if
-#' `units` is supplied, and `""` if not.
+#' @param unit Unit of measurement (e.g. `"m"` for meter, the SI unit of length).
+#' @param scale A scaling factor: `x` will be multiplied by `scale` before
+#' formatting. This is useful if the underlying data is already using an SI
+#' prefix.
 #' @export
 #' @family labels for continuous scales
 #' @family labels for log scales
 #' @examples
-#' demo_continuous(c(1, 1e9), label = label_number_si())
-#' demo_continuous(c(1, 5000), label = label_number_si(unit = "g"))
-#' demo_continuous(c(1, 1000), label = label_number_si(unit = "m"))
+#' demo_continuous(c(1, 1000), labels = label_number_si("m"))
 #'
-#' demo_log10(c(1, 1e9), breaks = log_breaks(10), labels = label_number_si())
-label_number_si <- function(accuracy = 1, unit = NULL, sep = NULL, ...) {
-  sep <- if (is.null(unit)) "" else " "
+#' demo_log10(c(1, 1e9), breaks = log_breaks(10), labels = label_number_si("m"))
+#' demo_log10(c(1e-9, 1), breaks = log_breaks(10), labels = label_number_si("g"))
+#'
+#' # use scale when data already uses SI prefix (e.g. stored in kg)
+#' kg <- label_number_si("g", scale = 1e3)
+#' demo_log10(c(1e-9, 1), breaks = log_breaks(10), labels = kg)
+label_number_si <- function(unit, accuracy = NULL, scale = 1, ...) {
+  sep <- if (is.null(unit) || !nzchar(unit)) "" else " "
   force_all(accuracy, ...)

   function(x) {
-    breaks <- c(0, 10^c(K = 3, M = 6, B = 9, T = 12))
-
-    n_suffix <- cut(abs(x),
-      breaks = c(unname(breaks), Inf),
-      labels = c(names(breaks)),
-      right = FALSE
-    )
-    n_suffix[is.na(n_suffix)] <- ""
-    suffix <- paste0(sep, n_suffix, unit)
+    rescale <- rescale_by_suffix(x * scale, breaks = 10^si_powers)

-    scale <- 1 / breaks[n_suffix]
-    # for handling Inf and 0-1 correctly
-    scale[which(scale %in% c(Inf, NA))] <- 1
+    suffix <- paste0(sep, rescale$suffix, unit)
+    scale <- scale * rescale$scale

     number(x,
       accuracy = accuracy,
-      scale = unname(scale),
+      scale = scale,
       suffix = suffix,
       ...
     )
   }
 }
+
+# power-of-ten prefixes used by the International System of Units (SI)
+# https://www.bipm.org/en/measurement-units/prefixes.html
+#
+# note: irregular prefixes (hecto, deca, deci, centi) are not stored
+# because they don't commonly appear in scientific usage anymore
+si_powers <- (-8:8) * 3
+names(si_powers) <- c(
+  rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
+  "k", "M", "G", "T", "P", "E", "Z", "Y"
+)
+si_powers
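
As a rough illustration of how the `si_powers` table drives prefix selection, the sketch below re-creates the table and applies it with a small stand-in helper. `si_label()` is hypothetical (it is not part of scales and skips the `number()` formatting that `label_number_si()` performs); it rounds to three significant digits purely to keep the output tidy.

```r
# Prefix table as defined in R/label-number-si.R in this diff.
si_powers <- (-8:8) * 3
names(si_powers) <- c(
  rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
  "k", "M", "G", "T", "P", "E", "Z", "Y"
)

# Hypothetical helper (not part of scales): choose the largest prefix whose
# power of ten does not exceed abs(x), then rescale and attach the unit.
si_label <- function(x, unit) {
  idx <- findInterval(abs(x), 10^si_powers)
  idx <- pmax(idx, 1L)                 # below 1e-24: use the smallest prefix
  idx[x == 0] <- which(si_powers == 0) # exact zero is left unscaled
  scaled <- signif(x / 10^si_powers[idx], 3)
  paste0(scaled, " ", names(si_powers)[idx], unit)
}

si_label(c(0.002, 1500, 3e-7), "g") # "2 mg" "1.5 kg" "300 ng"

# Data already stored in kg: multiply by 1e3 first, which is the role of
# the `scale` argument in label_number_si("g", scale = 1e3).
si_label(0.25 * 1e3, "g") # "250 g"
```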
2 changes: 1 addition & 1 deletion R/label-number.r
@@ -24,7 +24,7 @@
 #'
 #' Applied to rescaled data.
 #' @param scale A scaling factor: `x` will be multiplied by `scale` before
-#' formating. This is useful if the underlying data is very small or very
+#' formatting. This is useful if the underlying data is very small or very
 #' large.
 #' @param prefix,suffix Symbols to display before and after value.
 #' @param big.mark Character used between every 3 digits to separate thousands.
20 changes: 20 additions & 0 deletions R/rescale_by_suffix.R
@@ -0,0 +1,20 @@
+# each value of x is assigned a suffix and associated scaling factor
+rescale_by_suffix <- function(x, breaks) {
+  suffix <- as.character(cut(
+    abs(x),
+    breaks = c(unname(breaks), Inf),
+    labels = names(breaks),
+    right = FALSE
+  ))
+  suffix[is.na(suffix)] <- names(which.min(breaks))
+
+  scale <- unname(1 / breaks[suffix])
+  scale[which(scale %in% c(Inf, NA))] <- 1
+
+  # exact zero is not scaled
+  x_zero <- which(abs(x) == 0)
+  scale[x_zero] <- 1
+  suffix[x_zero] <- ""
+
+  list(scale = scale, suffix = suffix)
+}
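
The following sketch exercises `rescale_by_suffix()` on its own, using the same kind of breaks that `dollar()` builds from `rescale_short_scale()`. The function body is copied from the diff above so the example is self-contained; the input vector is made up for illustration.

```r
# Copied from R/rescale_by_suffix.R in this diff.
rescale_by_suffix <- function(x, breaks) {
  suffix <- as.character(cut(
    abs(x),
    breaks = c(unname(breaks), Inf),
    labels = names(breaks),
    right = FALSE
  ))
  suffix[is.na(suffix)] <- names(which.min(breaks))

  scale <- unname(1 / breaks[suffix])
  scale[which(scale %in% c(Inf, NA))] <- 1

  # exact zero is not scaled
  x_zero <- which(abs(x) == 0)
  scale[x_zero] <- 1
  suffix[x_zero] <- ""

  list(scale = scale, suffix = suffix)
}

# Breaks as dollar() constructs them: the unnamed 0 gets the empty name "".
breaks <- c(0, 10^c(K = 3L, M = 6L, B = 9L, T = 12L))
names(breaks) # "" "K" "M" "B" "T"

x <- c(0, 950, 1500, 2.5e6, 3e9)
res <- rescale_by_suffix(x, breaks)
res$suffix    # "" "" "K" "M" "B"
x * res$scale # 0, 950, 1.5, 2.5, 3 (up to floating-point rounding)
```

One detail worth noting: indexing a named vector with `""` never matches in R, so `breaks[""]` is `NA`. That is why the function maps `NA` (and `Inf`) scales back to 1, which leaves values below the first named break unscaled.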
2 changes: 1 addition & 1 deletion man/brewer_pal.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 7 additions & 3 deletions man/label_bytes.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.