sec.axis on a log10 scale uses non-transformed breaks #2729

Closed
jemus42 opened this Issue Jul 3, 2018 · 2 comments

@jemus42
Contributor

jemus42 commented Jul 3, 2018

I want to use a log10 y scale for dollar amounts, and a second axis for € conversion, but I can't figure out how to make the second axis have the "nice" breaks automatically used on the original axis. I thought dup_axis and the derive() helper would do the trick, but apparently not.

library(ggplot2)

df <- data.frame(x = rep(c("A", "B", "C"), 100),
                 y = sample(c(20, 20e6, 1e6, 50, 100), 300, replace = TRUE))

ggplot(data = df, aes(x, y)) +
  geom_boxplot() +
  scale_y_log10(labels = scales::dollar, sec.axis = sec_axis(~.*0.85))

ggplot(data = df, aes(x, y)) +
  geom_boxplot() +
  scale_y_log10(labels = scales::dollar, sec.axis = dup_axis(~.*0.85))

packageVersion("ggplot2")
#> [1] '2.2.1.9000'

Created on 2018-07-03 by the reprex package (v0.2.0).

Edit:
I have since tested this with ggplot2 v3.0.0 from CRAN, same result.

Please also see the discussion on community.rstudio.com

@dpseidel dpseidel self-assigned this Jul 11, 2018

@dpseidel dpseidel added the bug label Jul 11, 2018

@dpseidel

Member

dpseidel commented Jul 16, 2018

So after some digging, I've figured out why this is the case and, interestingly, that it is a log-transform-specific bug. Regardless of the primary axis's transformation, the secondary axis uses an identity transform, so when breaks are not specified they are calculated by default using scales::extended_breaks() rather than scales::log_breaks(), which is what log-transformed axes use to calculate breaks behind the scenes. Other transformations (reverse, sqrt, boxcox, expr, atan, etc., even log1p) are not affected because they all use extended_breaks() by default, so there is no mismatch.

To demonstrate, I pulled the limits calculated internally for your secondary axis and show the difference between the breaks calculations below:

# secRng is equivalent to the limits of the secondary axis as calculated internally 
# by `get_breaks()` 
secRng <- c(8.52, 3.39e7)

# you can see now where those axis ticks are coming from!
scales::extended_breaks()(secRng)
#> [1] 0e+00 1e+07 2e+07 3e+07

# as compared to:
scales::log_breaks(base = 10)(secRng)
#> [1] 1e+00 1e+02 1e+04 1e+06 1e+08
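
The same mismatch shows up on the transformation objects themselves. The sketch below is only an illustration and assumes that each scales transformation object exposes its default break function as `$breaks`. It compares identity, sqrt and log10 on the same limits:

```r
library(scales)

# Secondary-axis limits quoted in the comment above
secRng <- c(8.52, 3.39e7)

# Each transformation object carries its own default break function
id_breaks   <- identity_trans()$breaks(secRng)  # what the secondary axis falls back to
sqrt_breaks <- sqrt_trans()$breaks(secRng)      # sqrt keeps the extended_breaks() default
log_brks    <- log10_trans()$breaks(secRng)     # log10 swaps in log_breaks()

# Do identity and sqrt agree? Do identity and log10 agree?
identical(id_breaks, sqrt_breaks)
identical(id_breaks, log_brks)
```

If the explanation above holds, identity and sqrt should produce the same breaks, and only log10 should disagree.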

For now at least, the workaround (which I see you've already found on community.rstudio.com) is to specify your breaks, either on the primary axis (with breaks = derive() in sec_axis(), which is the default for dup_axis()) or explicitly in the sec_axis() call. For example:

library(ggplot2)
set.seed(1234)

df <- data.frame(
  x = rep(c("A", "B", "C"), 100),
  y = sample(c(20, 20e6, 1e6, 50, 100), 300, replace = TRUE)
)

ggplot(data = df, aes(x, y)) +
  geom_boxplot() +
  scale_y_log10(
    labels = scales::dollar,
    breaks = c(1, 100, 1000, 10000, 10e6, 10e7),
    sec.axis = sec_axis(~.*0.85,
      breaks = derive(),
      labels = scales::dollar_format(suffix = "")
    )
  )
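
The other variant of the workaround, passing breaks straight to sec_axis(), could look roughly like the sketch below. The particular break values and the euro prefix on the labels are illustrative assumptions, not something taken from the original report:

```r
library(ggplot2)
set.seed(1234)

df <- data.frame(
  x = rep(c("A", "B", "C"), 100),
  y = sample(c(20, 20e6, 1e6, 50, 100), 300, replace = TRUE)
)

# Hypothetical break values for the secondary (euro) axis: even powers of ten
euro_breaks <- 10^seq(0, 8, by = 2)

p <- ggplot(data = df, aes(x, y)) +
  geom_boxplot() +
  scale_y_log10(
    labels = scales::dollar,
    sec.axis = sec_axis(~ . * 0.85,
      breaks = euro_breaks,                             # explicit breaks, no derive()
      labels = scales::dollar_format(prefix = "\u20ac") # assumed euro formatting
    )
  )

p
```

Here the primary axis keeps its automatic log breaks and only the secondary axis is pinned, so the two sets of ticks will not necessarily line up.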

Interestingly, as you might have noticed from the exercise above, setting breaks to scales::log_breaks() explicitly (sec_axis() will respect the function) doesn't quite duplicate the axis either, simply because the limits are slightly different after the inversion and retransformation that happen when calculating the secondary axis:

ggplot(data = df, aes(x, y)) +
  geom_boxplot() +
  scale_y_log10(
    labels = scales::dollar,
    breaks = scales::log_breaks(),
    sec.axis = sec_axis(~. * .85,
      breaks = derive(),
      labels = scales::dollar_format(suffix = "")
    )
  )
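
A rough way to see why the limits matter is to run the same break function on both ranges. This sketch assumes the primary limits are simply the secondary limits quoted earlier divided by 0.85, which approximates what ggplot2 does internally but is not a copy of it:

```r
library(scales)

sec_range     <- c(8.52, 3.39e7)   # secondary limits quoted earlier
primary_range <- sec_range / 0.85  # assumption: undo the ~ . * 0.85 conversion

brk <- log_breaks(base = 10)

primary_breaks   <- brk(primary_range)  # breaks chosen from the primary limits
secondary_breaks <- brk(sec_range)      # breaks chosen from the secondary limits

primary_breaks
secondary_breaks

# Where the primary breaks would sit once converted to the secondary units
primary_breaks * 0.85
```

Because log_breaks() starts from the floor and ceiling of the log10 limits, a shift in the limits can change which powers of ten it picks.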

Created on 2018-07-16 by the reprex package (v0.2.0).


@thomasp85 I'm curious how you would envision a best-case fix for this. I can write some hack that sets better breaks for secondary axes if the primary is log transformed, but that's not a very elegant or extensible fix. If you have a few minutes to chat with me in the next week or two about potential ways forward, I'd be happy to take this on, and potentially the other open sec_axis feature request for time and date transforms (#2244) as well.

I did find a different bug while sorting this one out, which seems to have been introduced by PR #2095. Sampling from the transformed space rather than the original range causes the monotonicity test to fail regularly when using transforms like boxcox or sqrt, I think mostly or entirely due to default scale expansion. I trivially fixed this while testing for this issue, and can either fix it right away in a separate PR or wait and include it if we or I decide to tackle a larger refactoring of secondary axes in ggplot2. Let me know what you think.
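
One possible way the expansion could trip a monotonicity check is sketched below. This is a guess at the mechanism and not ggplot2's actual code. The data range, the 5% expansion, the grid size and the `is_monotonic()` helper are all assumptions made for illustration:

```r
library(scales)

# Hypothetical helper, not ggplot2's internal test
is_monotonic <- function(v) all(diff(v) >= 0) || all(diff(v) <= 0)

data_range <- c(0, 100)   # assumed data range on a sqrt-transformed axis
sq <- sqrt_trans()

# Sampling in transformed space: transform, expand, then map back
t_range   <- sq$transform(data_range)          # c(0, 10)
expanded  <- expand_range(t_range, mul = 0.05) # dips below zero after expansion
t_grid    <- seq(expanded[1], expanded[2], length.out = 23)
back_transformed <- sq$inverse(t_grid)         # squaring folds the negative part back up

# Sampling in the original data space instead
original_grid <- seq(data_range[1], data_range[2], length.out = 23)

is_monotonic(back_transformed)
is_monotonic(original_grid)
```

Under these assumptions the grid sampled in the original space stays monotonic, while the one sampled in transformed space does not, because the expansion pushes it below the sqrt domain.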

@thomasp85

Member

thomasp85 commented Jul 16, 2018

I’m on vacation this week, but let’s chat next week