Unexpected (?) orientation with geom_col #3932

Closed
m1kejay opened this issue Apr 7, 2020 · 7 comments · Fixed by #3998

Comments

@m1kejay

m1kejay commented Apr 7, 2020

I'm using ggplot2 v3.3.0.9000. The orientation of the geom_col bars is not correct when the values of x are numeric and less than 1. Specifying orientation = "x" does what I would expect to be the default behaviour. The documentation (https://ggplot2.tidyverse.org/reference/geom_bar.html) states that x is the default orientation, but that doesn't seem to be the case here.

library(tidyverse)

df <- data.frame(
  x = c(0.1, 0.2, 0.3),
  y = c(10L, 20L, 30L)
)
  
ggplot(df) +
  geom_col(aes(x = x, y = y))

ggplot(df) +
  geom_col(aes(x = x * 10, y = y))

ggplot(df) +
  geom_col(aes(x, y), orientation = "x")

@yutannihilation
Member

Thanks, confirmed. This is because c(10, 20, 30) is considered more "discrete-like" than c(0.1, 0.2, 0.3) in our current logic. But, I'm not immediately sure if this should be flipped or not...

ggplot2/R/utilities.r

Lines 600 to 602 in bca6105

# Is there a single discrete-like position
y_is_int <- if (has_y) isTRUE(all.equal(y, round(y))) else FALSE
x_is_int <- if (has_x) isTRUE(all.equal(x, round(x))) else FALSE
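
To see why the two vectors are treated differently, the quoted check can be run directly on the data from the original report. This is only a minimal sketch: it assumes has_x and has_y are both TRUE and reproduces just the two quoted lines, not the rest of ggplot2's orientation logic.

```r
x <- c(0.1, 0.2, 0.3)
y <- c(10L, 20L, 30L)

# Same test as in the quoted snippet, with has_x and has_y assumed TRUE.
x_is_int <- isTRUE(all.equal(x, round(x)))
y_is_int <- isTRUE(all.equal(y, round(y)))

# After rescaling (as in the second plot of the report), x also passes.
x10_is_int <- isTRUE(all.equal(x * 10, round(x * 10)))

print(c(x_is_int = x_is_int, y_is_int = y_is_int, x10_is_int = x10_is_int))
```

With the original data only y passes the "discrete-like" test; once x is multiplied by 10, both do.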

@m1kejay
Author

m1kejay commented Apr 8, 2020

Hmm, interesting, thanks for the insight. I don't really have strong feelings either way (after writing the below... maybe I do :) ). I'm not sure why 10, 20, 30 should be prioritised over 0.1, 0.2, 0.3. Here's a small sample of the data that I was originally trying to plot when I ran into this issue:

tibble::tribble(
    ~x,   ~y,
  0.05, 303L,
   0.1, 202L,
  0.15, 134L,
   0.2,  67L,
  0.25,  29L,
   0.3,  10L
  )

In this instance, while x is not an integer, it is an evenly spaced set, unlike y. To me, it seems obvious what the orientation should be; at the very least, I think this should return an ambiguous result.

In fact, having looked at the source code you just linked, it seems the following lines do try to figure this out:

ggplot2/R/utilities.r

Lines 631 to 632 in bca6105

y_is_regular <- if (has_y && length(y_diff) != 0) all((y_diff / min(y_diff)) %% 1 < .Machine$double.eps) else FALSE
x_is_regular <- if (has_x && length(x_diff) != 0) all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps) else FALSE

This seems to fail though with non-integer numbers (like mine):

x <- seq(0.05, 0.3, 0.05)
x_diff <- diff(sort(x))
x_diff <- x_diff[x_diff != 0]

all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps)
#> [1] FALSE

# Integers
x <- seq(1, 6, 1)
x_diff <- diff(sort(x))
x_diff <- x_diff[x_diff != 0]

all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps)
#> [1] TRUE

Could this be fixed by introducing some tolerance similar to near() (https://dplyr.tidyverse.org/reference/near.html)?

x <- seq(0.05, 0.3, 0.05)

x_diff <- diff(sort(x))
x_diff <- x_diff[x_diff != 0]

all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps^0.5)
#> [1] TRUE
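
One caveat with widening the threshold alone: the test `%% 1 < tol` is one-sided, so a ratio that lands just below a whole number (for instance, 0.3 / 0.1 evaluates to slightly less than 3 in double precision) leaves a remainder close to 1 and would still fail. A possible variant measures the distance to the nearest whole number instead. The helper below is a hypothetical sketch, not ggplot2 code; the name is_regular and the default tolerance are assumptions made for illustration.

```r
# Hypothetical helper, not part of ggplot2: a two-sided variant of the
# regularity check quoted above.
is_regular <- function(x, tol = sqrt(.Machine$double.eps)) {
  d <- diff(sort(x))
  d <- d[d != 0]
  if (length(d) == 0) return(FALSE)
  ratio <- d / min(d)
  # Distance to the nearest whole number, so ratios slightly below an
  # integer are accepted as well as ratios slightly above one.
  all(abs(ratio - round(ratio)) < tol)
}

# The one-sided test rejects a ratio that is just below 3.
one_sided_ok <- (0.3 / 0.1) %% 1 < sqrt(.Machine$double.eps)

print(is_regular(seq(0.05, 0.3, 0.05)))  # evenly spaced decimals
print(is_regular(seq(1, 6, 1)))          # integers
print(is_regular(c(0, 1, 2.5)))          # uneven spacing
print(one_sided_ok)
```

Under these assumptions the evenly spaced decimals and the integers are both treated as regular, while the unevenly spaced vector is not.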

@yutannihilation yutannihilation added this to the ggplot2 3.3.1 milestone Apr 9, 2020
@dmurdoch
Contributor

This example on SO seems to choose non-deterministically: https://stackoverflow.com/q/61235860/2554330. I saw both of the plots in ggplot2 3.3.0; the only difference was what code I had run before. I can't reliably reproduce the desired plot (orientation = "x"), I usually get the other.

@m1kejay
Author

m1kejay commented Apr 15, 2020

Is that SO question yours? I have just tried executing/knitting that code many times, and I consistently get only the incorrect orientation (the output matches the second figure), or the correct orientation if I add orientation = "x". But you see both orientations without modifying orientation?

From what I can tell, it seems that the example in the SO post suffers the same problem as my code above:

x <- c(-9.4, -9.3, -9.2, -9, -8.9, -8.8, -8.7, -8.5, -8.4, -8.3, 0)

x_diff <- diff(sort(x))
x_diff <- x_diff[x_diff != 0]

(x_diff / min(x_diff)) %% 1
#>  [1] 0.000000e+00 1.776357e-14 0.000000e+00 0.000000e+00 0.000000e+00
#>  [6] 1.776357e-14 0.000000e+00 0.000000e+00 0.000000e+00 2.984279e-13

all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps)
#> [1] FALSE

all((x_diff / min(x_diff)) %% 1 < .Machine$double.eps^0.5)
#> [1] TRUE

@dmurdoch
Contributor

I thought I did see both, but now when I re-run it, I can't get the "x" orientation again. What might have happened is that I accidentally edited the data to contain non-integer y values, then undid my edit.

My taste would run to a much simpler rule: if one axis is a factor, that determines orientation. If not, default to "x".
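
A minimal sketch of that rule, purely for illustration: guess_orientation is a hypothetical name, not a ggplot2 function, and the handling of the case where both axes are factors (falling back to "x") is an assumption, since the rule as stated does not cover it.

```r
# Hypothetical illustration of the proposed rule; not ggplot2 code.
guess_orientation <- function(x, y) {
  # Only a factor on y (and not on x) flips the orientation.
  # Everything else, including two numeric axes, defaults to "x".
  if (is.factor(y) && !is.factor(x)) "y" else "x"
}

print(guess_orientation(c(0.1, 0.2, 0.3), c(10L, 20L, 30L)))
print(guess_orientation(c(10, 20, 30), factor(c("a", "b", "c"))))
print(guess_orientation(factor(c("a", "b", "c")), c(10, 20, 30)))
```

Under this rule the data from the original report would be drawn with vertical bars, because neither column is a factor.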

@yutannihilation
Member

The SO question says:

> I have noticed that after loading the tidyverse package, some versions are different. For example, in the first one ggplot is 3.2.1, while in the later it is 3.3.0.

So, I'm not sure if this is the same issue as you encountered.

> My taste would run to a much simpler rule: if one axis is a factor, that determines orientation. If not, default to "x".

Yeah, true. The difficulty is that the current logic determines the orientation at the stage where there's no information about the type of axis...

@titaniumtroop

It appears that repeated values of y will trigger orientation = "y" behavior (i.e., geom_col produces horizontal bars, rather than vertical), even if y is numeric. Reprex:

df <- data.frame(
  x = seq(from = 100, to = 300, length.out = 20), 
  y = (rep(6.5:10.5, 4)), 
  z = factor(c(rep(1, 5), rep(2, 5), rep(3, 5), rep(4, 5)))
)

ggplot2::ggplot(df, ggplot2::aes(x, y, fill = z)) +
  ggplot2::geom_col(position = "dodge")

Created on 2020-04-24 by the reprex package (v0.3.0)

According to the documentation for geom_col, this appears to be one of the rare events where determining orientation from aesthetic mapping fails.
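
As a possible workaround in the meantime, the orientation can be set explicitly, as in the original report at the top of this thread. The sketch below applies that to the reprex above. It assumes ggplot2 3.3.0 or later is installed and that orientation = "x" combines with position = "dodge" as it does in the simpler examples; the resulting plot has not been checked here.

```r
library(ggplot2)

df <- data.frame(
  x = seq(from = 100, to = 300, length.out = 20),
  y = rep(6.5:10.5, 4),
  z = factor(rep(1:4, each = 5))
)

# Setting orientation explicitly bypasses the automatic guess.
p <- ggplot(df, aes(x, y, fill = z)) +
  geom_col(position = "dodge", orientation = "x")
```

Printing p should then draw the bars along the x axis regardless of what the heuristic would have chosen.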
