Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn about discouraged aes() usage during plot build #3346

Merged

Conversation

@paleolimbot
Copy link
Collaborator

@paleolimbot paleolimbot commented May 30, 2019

This is a PR that implements warnings when the layer mapping contains syntax like data$column instead of .data$column or (even better) column. This PR also implements an experimental and very open to discussion warning about layers that never map any columns from the data. Collectively, this is designed to encourage the correct usage of data and mappings, particularly in packages where this is sometimes done to avoid non-standard evaluation. This PR fixes #2693.

Warnings for $ and [[ usage in aes():

library(ggplot2)

build <- function(p) invisible(ggplot_build(p))

p <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# these do not issue warnings
build(p)
build(p + aes(.data$cty))
build(p + aes(.data[["cty"]]))

# these issue warnings
build(p + aes(mpg$cty))
#> Warning: Use of `mpg$cty` is discouraged. Use `cty` instead.
build(p + aes(mpg[["cty"]]))
#> Warning: Use of `mpg[["cty"]]` is discouraged. Use `.data[["cty"]]`
#> instead.
mpg2 <- mpg
build(p + aes(mpg2$cty))
#> Warning: Use of `mpg2$cty` is discouraged. Use `cty` instead.

Created on 2019-05-30 by the reprex package (v0.2.1)

As discussed in #2693, it is probably going too far to check for zero column mappings. I implemented it here just for discussion...it causes 5 warnings in the test cases (all of which fit the description of zero mapped columns from data).

library(ggplot2)

build <- function(p) invisible(ggplot_build(p))

p <- ggplot(mpg, aes(cty)) + geom_histogram(bins = 30)

# these do not issue warnings
build(p)
build(p + aes(.data$cty))
build(p + aes(.data[["cty"]]))
build(p + aes(cty, fill = 1))

# this doesn't issue a warning about the mappings
build(p + aes(mpg$cty))
#> Warning: Use of `mpg$cty` is discouraged. Use `cty` instead.

# these issue warnings
build(p + aes(1))
#> Warning: Mapping contains zero mapped columns from data
build(p + aes(runif(nrow(mpg))))
#> Warning: Mapping contains zero mapped columns from data

Created on 2019-05-30 by the reprex package (v0.2.1)

@clauswilke
Copy link
Member

@clauswilke clauswilke commented May 30, 2019

I don't see a good reason to issue warnings on zero mapped columns. I think it'll get in the way of meaningful use cases, in particular annotations.

library(ggplot2)
ggplot() + geom_text(aes(x = 1, y = 1, label = "annotation"))

Created on 2019-05-30 by the reprex package (v0.2.1)

@paleolimbot
Copy link
Collaborator Author

@paleolimbot paleolimbot commented May 30, 2019

I'm with you that it's probably too strict, although the example you posted doesn't issue a warning (because the layer data is empty). If the layer has data (which is still super common because there's no idomatic way to stop a layer from inheriting the plot data), the warning will fire:

library(ggplot2)
ggplot(mpg) + geom_text(aes(x = 1, y = 1, label = "annotation"))
#> Warning: Mapping contains zero mapped columns from data

ggplot(mpg) + geom_text(aes(x = 1, y = 1, label = "annotation"), data = data.frame())

Created on 2019-05-30 by the reprex package (v0.2.1)

@paleolimbot paleolimbot requested a review from hadley May 30, 2019
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
R/aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
tests/testthat/test-aes.r Outdated Show resolved Hide resolved
@hadley
Copy link
Member

@hadley hadley commented Jun 4, 2019

Worth checking to see if this flags problems with any existing examples or vignettes.

And worth adding a sentence or two to the parameter documentation of aes(). Something like: Aesthetics are automatically looked for in the plot data; do not specify like aes(x = mydf$variable1)

…ent documentation in aes(), improved the name of the warning functions, improved function detecting whether or not an expression refers to the plot data
@paleolimbot
Copy link
Collaborator Author

@paleolimbot paleolimbot commented Jun 4, 2019

There are no problems in the examples/vignettes (as assessed by changing warning() to stop() and running CMD check). I opted to demonstrate correct usage in the aes() parameter documentation rather than discouraged usage - feel free to suggest a change.

@hadley
hadley approved these changes Jun 4, 2019
R/aes.r Outdated
}

extract_target_is_likely_data <- function(x, data, env) {
if(!is.name(x[[2]])) {

This comment has been minimized.

@hadley

hadley Jun 4, 2019
Member

Suggested change
if(!is.name(x[[2]])) {
if (!is.name(x[[2]])) {
R/aes.r Outdated
#' @param x,y,... List of name-value pairs in the form `aesthetic = column_name`
#' describing which variables in the layer data should be mapped to which
#' aesthetics used by paired geom/stat. The names for x and y aesthetics are typically
#' omitted because they are so common; all other aesthetics must be named.

This comment has been minimized.

@hadley

hadley Jun 4, 2019
Member

I do think it's worth being explicit about the wrong way to do things here. Maybe start a new paragraph and say something like: It's not necessary to refer to the dataset in aes(). Instead of writing aes(x = data$myvariable), all you need is aes(myvariable).

Maybe @mine-cetinkaya-rundel has thoughts on this?

This comment has been minimized.

@mine-cetinkaya-rundel

mine-cetinkaya-rundel Jun 4, 2019
Contributor

I agree with @hadley that it's good to be explicit about what not to do, especially for those coming to ggplot2 from plot() where data$column_name usage is common. I might suggest matching the wording to the warning and say

"It's discouraged to refer to the data in aes(). Instead of aes(x = data$variable), use aes(variable)."

and similarly use aesthetic = variable earlier in the description (or use column or column_name or variable_name but keep it consistent throughout). I think it's more correct to avoid _name since it's not the name (character string) but the variable itself we're referring to, but I'm not sure.

This comment has been minimized.

@paleolimbot

paleolimbot Jun 4, 2019
Author Collaborator

I went for a mashup of both your suggestions and it feels much better now. Thank you!

… documentation for aes()
@hadley
hadley approved these changes Jun 4, 2019
@paleolimbot paleolimbot merged commit b560662 into tidyverse:master Jun 4, 2019
4 checks passed
4 checks passed
codecov/patch 96% of diff hit (target 78.91%)
Details
codecov/project 79.16% (+0.24%) compared to 1f6f0cb
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@paleolimbot paleolimbot deleted the paleolimbot:issue-2693-extrat-usage-in-aes branch Jun 4, 2019
@lock
Copy link

@lock lock bot commented Dec 1, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Dec 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants
You can’t perform that action at this time.