Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn about discouraged aes() usage during plot build #3346

Merged

Conversation

paleolimbot
Copy link
Member

This is a PR that implements warnings when the layer mapping contains syntax like data$column instead of .data$column or (even better) column. This PR also implements an experimental and very open to discussion warning about layers that never map any columns from the data. Collectively, this is designed to encourage the correct usage of data and mappings, particularly in packages where this is sometimes done to avoid non-standard evaluation. This PR fixes #2693.

Warnings for $ and [[ usage in aes():

library(ggplot2)

build <- function(p) invisible(ggplot_build(p))

p <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# these do not issue warnings
build(p)
build(p + aes(.data$cty))
build(p + aes(.data[["cty"]]))

# these issue warnings
build(p + aes(mpg$cty))
#> Warning: Use of `mpg$cty` is discouraged. Use `cty` instead.
build(p + aes(mpg[["cty"]]))
#> Warning: Use of `mpg[["cty"]]` is discouraged. Use `.data[["cty"]]`
#> instead.
mpg2 <- mpg
build(p + aes(mpg2$cty))
#> Warning: Use of `mpg2$cty` is discouraged. Use `cty` instead.

Created on 2019-05-30 by the reprex package (v0.2.1)

As discussed in #2693, it is probably going too far to check for zero column mappings. I implemented it here just for discussion...it causes 5 warnings in the test cases (all of which fit the description of zero mapped columns from data).

library(ggplot2)

build <- function(p) invisible(ggplot_build(p))

p <- ggplot(mpg, aes(cty)) + geom_histogram(bins = 30)

# these do not issue warnings
build(p)
build(p + aes(.data$cty))
build(p + aes(.data[["cty"]]))
build(p + aes(cty, fill = 1))

# this doesn't issue a warning about the mappings
build(p + aes(mpg$cty))
#> Warning: Use of `mpg$cty` is discouraged. Use `cty` instead.

# these issue warnings
build(p + aes(1))
#> Warning: Mapping contains zero mapped columns from data
build(p + aes(runif(nrow(mpg))))
#> Warning: Mapping contains zero mapped columns from data

Created on 2019-05-30 by the reprex package (v0.2.1)

@clauswilke
Copy link
Member

I don't see a good reason to issue warnings on zero mapped columns. I think it'll get in the way of meaningful use cases, in particular annotations.

library(ggplot2)
ggplot() + geom_text(aes(x = 1, y = 1, label = "annotation"))

Created on 2019-05-30 by the reprex package (v0.2.1)

@paleolimbot
Copy link
Member Author

I'm with you that it's probably too strict, although the example you posted doesn't issue a warning (because the layer data is empty). If the layer has data (which is still super common because there's no idomatic way to stop a layer from inheriting the plot data), the warning will fire:

library(ggplot2)
ggplot(mpg) + geom_text(aes(x = 1, y = 1, label = "annotation"))
#> Warning: Mapping contains zero mapped columns from data

ggplot(mpg) + geom_text(aes(x = 1, y = 1, label = "annotation"), data = data.frame())

Created on 2019-05-30 by the reprex package (v0.2.1)

@paleolimbot paleolimbot requested a review from hadley May 30, 2019 22:54
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
@hadley
Copy link
Member

hadley commented Jun 4, 2019

Worth checking to see if this flags problems with any existing examples or vignettes.

And worth adding a sentence or two to the parameter documentation of aes(). Something like: Aesthetics are automatically looked for in the plot data; do not specify like aes(x = mydf$variable1)

…ent documentation in aes(), improved the name of the warning functions, improved function detecting whether or not an expression refers to the plot data
@paleolimbot
Copy link
Member Author

There are no problems in the examples/vignettes (as assessed by changing warning() to stop() and running CMD check). I opted to demonstrate correct usage in the aes() parameter documentation rather than discouraged usage - feel free to suggest a change.

R/aes.r Outdated
}

extract_target_is_likely_data <- function(x, data, env) {
if(!is.name(x[[2]])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if(!is.name(x[[2]])) {
if (!is.name(x[[2]])) {

R/aes.r Outdated
#' @param x,y,... List of name-value pairs in the form `aesthetic = column_name`
#' describing which variables in the layer data should be mapped to which
#' aesthetics used by paired geom/stat. The names for x and y aesthetics are typically
#' omitted because they are so common; all other aesthetics must be named.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think it's worth being explicit about the wrong way to do things here. Maybe start a new paragraph and say something like: It's not necessary to refer to the dataset in aes(). Instead of writing aes(x = data$myvariable), all you need is aes(myvariable).

Maybe @mine-cetinkaya-rundel has thoughts on this?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @hadley that it's good to be explicit about what not to do, especially for those coming to ggplot2 from plot() where data$column_name usage is common. I might suggest matching the wording to the warning and say

"It's discouraged to refer to the data in aes(). Instead of aes(x = data$variable), use aes(variable)."

and similarly use aesthetic = variable earlier in the description (or use column or column_name or variable_name but keep it consistent throughout). I think it's more correct to avoid _name since it's not the name (character string) but the variable itself we're referring to, but I'm not sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for a mashup of both your suggestions and it feels much better now. Thank you!

@paleolimbot paleolimbot merged commit b560662 into tidyverse:master Jun 4, 2019
@paleolimbot paleolimbot deleted the issue-2693-extrat-usage-in-aes branch June 4, 2019 23:33
@lock
Copy link

lock bot commented Dec 1, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Dec 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warn about use of $ in aes()
4 participants