size in geom_point using variable with all but 1 as NAs #985

Closed
iagomosqueira opened this Issue Jul 10, 2014 · 3 comments

iagomosqueira commented Jul 10, 2014

When a data.frame column is mapped to the size aesthetic in geom_point and all but one of its values are NA, all points are plotted at the same size, i.e. the NAs are not dropped. If two non-NA values exist, the plot looks as expected.

I wanted to know whether this is an unavoidable consequence of how size is calculated or really a bug. If the latter, I could try finding and fixing it.

  library(ggplot2)

  df <- data.frame(year=rep(10:14, each=3), age=1:3, data=runif(15))

  # points sized as expected
  ggplot(df, aes(year, age)) + geom_point(aes(size=data))

  # convert all values but 2 into NAs
  df$data[1:13] <- NA
  # 2 points sized and shown
  ggplot(df, aes(year, age)) + geom_point(aes(size=data), na.rm=TRUE)

  # convert all but one value into NA
  df$data[1:14] <- NA

  # plot shows equal-sized dots for all datapoints
  ggplot(df, aes(year, age)) + geom_point(aes(size=data), na.rm=TRUE)

I tried different names for the data column, but it made no difference. Tested with both ggplot2 1.0.0 from CRAN and 1.0.0.99 from GitHub.

Thanks

hadley commented Jun 12, 2015

That is a strange one. I'll take a look. Even more minimal reprex:

df <- data.frame(x = rep(1:2, each = 2), y = rep(1:2, 2), z = c(1, NA, NA, NA))
ggplot(df, aes(x, y, size = z)) + geom_point()

hadley commented Jul 23, 2015

Ah, this is a bug in scales::rescale():

function(x, to = c(0, 1), from = range(x, na.rm = TRUE)) {
  if (zero_range(from) || zero_range(to)) return(rep(mean(to), length(x)))

  (x - from[1]) / diff(from) * diff(to) + to[1]
}

If from has zero range (zero_range(from) is TRUE), the function needs to preserve the missing values in x instead of replacing every element with mean(to).
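
To illustrate the behaviour described above, here is a minimal sketch of a rescaling function that keeps NA values when the input range is degenerate. It is not the code from commit 0cf9a76; the names rescale_keep_na and is_zero_range are hypothetical, and is_zero_range is a simplified base R stand-in for scales::zero_range().

```r
# Hypothetical helper: TRUE when both ends of a range are (nearly) equal.
# A simplified stand-in for scales::zero_range(), assumed for illustration.
is_zero_range <- function(r, tol = 1e-8) {
  abs(r[2] - r[1]) < tol
}

# Hypothetical variant of the function quoted above that preserves NAs.
rescale_keep_na <- function(x, to = c(0, 1), from = range(x, na.rm = TRUE)) {
  if (is_zero_range(from) || is_zero_range(to)) {
    # Degenerate range: non-missing values go to the midpoint of `to`,
    # missing values stay missing.
    return(ifelse(is.na(x), NA_real_, mean(to)))
  }

  (x - from[1]) / diff(from) * diff(to) + to[1]
}

# The z column from the minimal reprex: one value, three NAs.
rescale_keep_na(c(1, NA, NA, NA))
```

With the NAs kept in the rescaled vector, only the single non-missing point should end up with a size, which is what the original report expected.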

hadley closed this in 0cf9a76 Jul 23, 2015

lock bot locked as resolved and limited conversation to collaborators Jun 19, 2018
