Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Inert out-of-bounds function #255

Closed
teunbrand opened this issue Feb 15, 2020 · 12 comments
Closed

Feature request: Inert out-of-bounds function #255

teunbrand opened this issue Feb 15, 2020 · 12 comments
Labels
feature a feature request or enhancement

Comments

@teunbrand
Copy link
Contributor

I think it might be useful to have an out-of-bounds function similar to censor(), squish() and squish_infite() that does nothing to the data. The function can be very simple and trivial (probably could use a better name though):

inert_oob <- function(x, ...) x

It is very similar to base::identity and base::force but these don't take additional arguments and raise errors when used as oob-arguments in ggplot2. I'm sure a function like this must exist somewhere, but maybe my R vocabulary isn't large enough.

The use case for this in the ggplot2 package, would be to achieve the same what the coordinate limits do to the plot.

To demonstrate this, notice that the following plots are equivalent:

df <- data.frame(
  x = 1:3,
  y = c(1, 10, Inf)
)

ggplot(df, aes(x, y)) +
  geom_col() +
  coord_cartesian(ylim = c(0, 5))

image

ggplot(df, aes(x, y)) +
  geom_col() +
  scale_y_continuous(
    limits = c(0, 5),
    oob = inert_oob
  )

image

I've seen on quite a number of stack overflow posts that people are confused why they must use the coordinate limits and not the scale limits or can't figure out why their barplots dissappear when the lower-limit on y is > 0 (for example: https://stackoverflow.com/questions/58954151/problems-adapting-the-y-axis-to-2x2-anova-bargraph-using-r-and-ggplot/)

Formalising this behaviour of the scale limits as a function might help people better understand that not touching the data is a valid option for axis limits too.

If this function already exists somewhere, maybe it might be best to ignore this issue and I should ask the people at ggplot2 whether they would consider to document this use somewhere.

Thank you for taking the time to read!

@clauswilke
Copy link
Collaborator

This sounds like a good idea to me. Just needs a better name. leave_alone()? keep_all()?

@teunbrand
Copy link
Contributor Author

Those both seem better than inert_oob(). The censor/squish/squish_infinite seems to be verbs that describe what is being done to the out of bounds values, i.e.: censor out of bounds values, squish out of bounds values.

"keep_all out of bounds values" follows that pattern nicely too, while "leave_alone out of bounds values" flows somewhat more awkwardly.

To add a few suggestions: "tolerate out of bounds values", "permit out of bounds values", "allow out of bounds values", "ignore out of bounds values" and "preserve out of bounds values".

@clauswilke
Copy link
Collaborator

The package purrr has a function keep() that retains values based on a condition provided. The name keep_all() would be consistent with this. It's basically keep(..., .p = TRUE).

@hadley
Copy link
Member

hadley commented Apr 3, 2020

If we add more functions, I think we need a common prefix, i.e. oob_keep()

@hadley hadley added the feature a feature request or enhancement label Apr 3, 2020
@clauswilke
Copy link
Collaborator

@hadley In this case, would you rename the existing functions? Add versions with oob_ in front and soft-deprecate the existing ones? Keep both versions around?

@hadley
Copy link
Member

hadley commented Apr 3, 2020

Yeah, rename the old ones and just keep around. I doubt it's worth the hassle to deprecate them.

@clauswilke
Copy link
Collaborator

@teunbrand Do you want to amend your PR accordingly? Call your function oob_keep() and introduce variants with oob_ in front for all the other OOB functions?

@teunbrand
Copy link
Contributor Author

Yes, this sounds like a good idea. I'll adapt the PR.

@teunbrand
Copy link
Contributor Author

I've prefixed the squish()/squish_infinite()/censor()/discard() functions with oob_ and assigned the old names with the new function, .e.g: squish <- oob_squish with an @rdname tag referring to oob_squish. Also, I've renamed keep_all() to oob_keep().

Would it be worth adding a @family tag to these functions? The outlier is discard(), which isn't size stable, so it cannot used interchangeably with the rest.

@clauswilke
Copy link
Collaborator

I think adding a @family tag makes sense, and I would include oob_discard(). It's clearly part of the family, even if it behaves somewhat differently.

@teunbrand
Copy link
Contributor Author

Alright, the tags are added. Also, I took the liberty of adding a news bullet in case it was needed.

clauswilke pushed a commit that referenced this issue Apr 4, 2020
* Adds keep_all as bounds function.

* Prefix out of bounds functions with `oob_`.

* Bounds tests use `oob_`-prefixed functions.

* Re-oxygenate

* Add @family roxygen tags for `oob_*` functions

* Add NEWS.md bullet
@clauswilke
Copy link
Collaborator

Closed via #258.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants