-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New approach to variable rescaling #339
Conversation
@davidchall I'll probably start the release process next Friday, so if you have time before then I'd love to get your feedback. |
Hi @hadley - I remember the design being quite tricky when I wrote this code a couple of years ago, so I really appreciate you taking another look at it before release. I wondered about surfacing the long and short scales in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is absolutely fantastic! 🎉 It feels much more natural, and I'm sure it will be easier to maintain as a result. Most of my feedback was very minor.
My only remaining concern is about assigning units to the number zero. My previous solution would output "0 B" from the bytes labeler, "0 m" from the SI unit labeler, and "0" from the currency labeler (which has no unit). I think these now zeros always omit the unit. I think this has a couple of disadvantages, and I wonder if others would report this later?
-
On plot axes, it'll look weird to have inconsistent tick labels (e.g.
0, 0.5 m, 1.0 m, 1.5 m, 2.0 m
). -
From a scientific standpoint, I think there is a difference between "0" and "0 m". The latter implies that the measured quantity has a dimension of length. To omit the unit is odd because it suggests the dimension of the axis is inconsistent (length and dimensionless).
In conclusion, I can't wait to start using these functions! 😄
}) | ||
|
||
test_that("handles out-of-range inputs", { | ||
expect_equal(number(1e15, scale_cut = cut_short_scale()), "1 000T") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also test the low end, because cut()
is directional.
expect_equal(number(1e15, scale_cut = cut_short_scale()), "1 000T") | |
expect_equal(number(1e15, scale_cut = cut_short_scale()), "1 000T") | |
expect_equal(number(1e-25, scale_cut = cut_si("")), "0.1y") |
test_that("cut_si() adds space before unit", { | ||
expect_equal(number(c(1, 1000), scale_cut = cut_si("m")), c("1 m", "1 km")) | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should test how cut_si("m")
handles zero. It seems like an edge case due to how cut_si()
bakes the units into the scale suffix. IMHO the expected result should be "0 m".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, should be fixed now.
NEWS.md
Outdated
for billion). It can be used with `cut_short_scale()` when billion = | ||
thousand million and `cut_long_scale()` when billion = million million. | ||
Additionally, the accuracy is now computed per scale category, so rescaled | ||
values can have different numbers of decimal places. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I didn't implement the final version, but I did spend quite a bit of time thinking about this problem. Could I possibly request something like (with help from @davidchall)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely! Many apologies for accidentally editing out your contributions to this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries!
Co-authored-by: David C Hall <davidchall@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! I really appreciate your help making this code higher quality.
NEWS.md
Outdated
for billion). It can be used with `cut_short_scale()` when billion = | ||
thousand million and `cut_long_scale()` when billion = million million. | ||
Additionally, the accuracy is now computed per scale category, so rescaled | ||
values can have different numbers of decimal places. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely! Many apologies for accidentally editing out your contributions to this feature.
number(1, scale_cut = "x") | ||
number(1, scale_cut = c(x = 0, y = 1)) | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem worth testing to me, as it's obvious via inspection of the code, and any test would effectively just be testing that base R generates a useful error message here.
test_that("cut_si() adds space before unit", { | ||
expect_equal(number(c(1, 1000), scale_cut = cut_si("m")), c("1 m", "1 km")) | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, should be fixed now.
tests/testthat/test-label-dollar.R
Outdated
c("$1", "$1K", "$1M", "$1,000M", "$1B", "$1,000B", "$1T") | ||
) | ||
expect_equal( | ||
label_dollar(rescale_large = c(k = 3L, m = 6L, bn = 9L, tn = 12L))(x), | ||
label_dollar(scale_cut = c(k = 1e3, m = 1e6, bn = 1e9, tn = 1e12))(x), | ||
c("$1", "$1k", "$1m", "$1bn", "$1tn", "$1,000tn", "$1,000,000tn") | ||
) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I generally simplified these tests to focus on what label_dollar()
now implements.
auto units (@davidchall, #235) and leaves `0` as is (instead of formatting to | ||
"0 B") for consistency with `label_number_si()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hadley I think this addition should be removed, now that we format as "0 B".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll fix
@davidchall I started getting worried that the change to
label_number_si()
was likely to affect too much existing code, and after much refactoring this is what I ended up with:number()
gets a new argument calledscale_cut
, equivalent todollar()
'srescale_large
.rescale_long_scale()
andrescale_short_scale()
are renamed tocut_short_scale()
andcut_long_scale()
cut_si()
andcut_bytes()
to pull out the code previously nested insidelabel_number_si()
andlabel_bytes()
. This involved refactoringscale_cut()
to take a vector of break points, notlog10
break points.Along the way, I also fixed #264 by refactoring
number()
to allowaccuracy
to be a vector (which then requires some manual looping overnsmall
).I think overall this is a cleaner separation of concerns and should be a little more flexible because it's implemented in
number()
rather than the wrappers.