Add asymmetrical expand argument to continuous and discrete scales #1669

Closed
huftis opened this issue Jul 15, 2016 · 29 comments
Labels
feature a feature request or enhancement scales 🐍

Comments

@huftis
Contributor

huftis commented Jul 15, 2016

This is a feature request for an asymmetrical expand argument to the scale_continuous and scale_discrete scales. A typical use case would be for bar plots that touch the x axis but still have some space above them. For example

ggplot(mtcars) + 
  geom_bar(aes(x = cyl))

has some space above the bars, but also an annoying space below them. Now,

ggplot(mtcars) + 
  geom_bar(aes(x = cyl)) +
  scale_y_continuous(expand = c(0, 0))

gets rid of the space below the bars, but also the space above the bars. To keep the space above the bars, one has to use a hack with a manually calculated value:

ggplot(mtcars) + 
  geom_bar(aes(x = cyl)) +
  scale_y_continuous(expand = c(0, 0)) +
  annotate("blank", x = 6, y = 14.7)

(And this doesn’t work well with more complicated plots, e.g. involving facets with free_y &c.)

Suggested feature/syntax: Change the expand argument from

expand = c(a, b)

to

expand = c(a, b, c, d)

where a is the multiplier for the lower limit, b is the additive term for the lower limit, and c and d are the corresponding multiplier and additive term for the upper limit. If c is missing, use the value of a, and if d is missing, use the value of b. This way, all old code would continue to work.
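The fallback rule described above can be sketched in a few lines of base R. This is only an illustration of the proposed semantics, not code from ggplot2; the helper name `pad_expand()` is hypothetical.

```r
# Hypothetical helper (not part of ggplot2): pad an expand vector of
# length 2, 3 or 4 to the full form c(a, b, c, d), where (a, b) apply to
# the lower limit and (c, d) to the upper limit.
pad_expand <- function(expand) {
  stopifnot(is.numeric(expand), length(expand) %in% 2:4)
  a <- expand[1]
  b <- expand[2]
  c_mult <- if (length(expand) >= 3) expand[3] else a  # c missing: reuse a
  d_add  <- if (length(expand) >= 4) expand[4] else b  # d missing: reuse b
  c(a, b, c_mult, d_add)
}

pad_expand(c(0.05, 0))        # old symmetric syntax still works
pad_expand(c(0, 0, 0.05))     # only the upper multiplier differs
pad_expand(c(0, 0, 0.05, 0))  # fully asymmetric
```

Because a length 2 vector is padded to the same values on both sides, existing calls would keep their current behaviour under this rule.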

See http://stackoverflow.com/questions/34623780/asymmetric-expansion-of-limits-ggplot2-2-0 for more information. This post also has suggested code for an asymmetrical expand feature, but this does not work for complicated facet grids, nor with coord_flip().

@steveharoz
Contributor

Have a look at the expand_limits function

ggplot(mtcars) + 
  geom_bar(aes(x = cyl), width = 1) +
  scale_y_continuous(expand = c(0,0)) +
  expand_limits(y=25)

@huftis
Contributor Author

huftis commented Jul 15, 2016

@steveharoz expand_limits() is a wrapper for geom_blank(), but still requires one to manually calculate the limits. This feature request is for an extension of expand to handle separate values for the lower and upper limits, so that the correct limits are determined automatically. For this example, the new syntax would be:

ggplot(mtcars) +
  geom_bar(aes(x = cyl)) +
  scale_y_continuous(expand = c(0, 0, 0.05, 0))
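To make the arithmetic behind this request concrete, here is a rough base R sketch of the limits that would result for the bar chart above. It assumes the expansion on each side is `multiplier * range width + additive term`; the helper `expand_asym()` is hypothetical and is not ggplot2 code.

```r
# Bar heights for geom_bar(aes(x = cyl)): counts per cylinder class.
counts  <- table(mtcars$cyl)
y_range <- c(0, max(counts))  # bars start at 0 and end at the largest count

# Hypothetical helper: move each limit outwards by
# (multiplier * range width + additive term), separately per side.
expand_asym <- function(range, expand) {
  width <- diff(range)
  c(range[1] - (expand[1] * width + expand[2]),
    range[2] + (expand[3] * width + expand[4]))
}

expand_asym(y_range, c(0.05, 0, 0.05, 0))  # symmetric 5 %: lower limit dips below 0
expand_asym(y_range, c(0, 0, 0.05, 0))     # proposed syntax: lower limit stays at 0
```

With the asymmetric vector the upper limit works out to about 14.7, the same value that had to be typed by hand in the annotate() hack earlier in the thread.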

@huftis
Contributor Author

huftis commented Jul 17, 2016

This doesn’t seem too difficult to add. I’ll try to create a patch and a pull request.

huftis added a commit to huftis/ggplot2 that referenced this issue Jul 17, 2016
The `expand` argument for `scale_*_continuous()` and
`scale_*_discrete()` now accepts separate expansion
constants for the lower and upper range limits.

This makes it much easier to create bar charts where the
bottom of the bars are flush with the x axis but the bars
still have some (automatically calculated amount of) space
above them:

    ```R
    ggplot(mtcars) +
    geom_bar(aes(x = cyl)) +
    scale_y_continuous(expand = c(0, 0, 0.05, 0))
    ```

The syntax for the multiplicative and additive expansion
constants has been changed from `c(m, a)` to
`c(m_lower, a_lower, m_upper, a_upper)`. The old syntax
will still work, as length 2 vectors `c(m, a)` are
expanded to `c(m, a, m, a)` and length 3 vectors
are expanded from `c(m1, a1, m2)` to `c(m1, a1, m2, a1)`.
(@huftis, tidyverse#1669)
@hadley
Member

hadley commented Jul 28, 2016

Hmmm, I'm not convinced I want this. Would you mind including a couple of before and after plots?

@huftis
Contributor Author

huftis commented Aug 1, 2016

@hadley Sure.

First, note that this does not change the default look of ggplot2 graphs at all. So a simple before and after plot would look identical.

It gives the user the option of asymmetrical automatic range expansion for the x and y scales. This is most useful for bar charts (but there may be other uses). In almost all programs, the bars are attached to an axis, with no gap. Illustration: https://www.google.no/search?q=bar+chart&tbm=isch

However, in ggplot2 there is a gap, making the bars hover above the x axis:

ggplot2-default

or the y axis:

ggplot2-flip

One can use the expand argument to remove the gap (scale_*_continuous(expand = c(0,0))), but this also removes the gap above / to the right of the bars, which looks bad:

ggplot2-expand-sym

One can use a geom_blank() hack or expand_limits(), but then one has to manually calculate the new limits. And it doesn’t work with faceting, so it is of limited value.

This pull request adds a feature for asymmetric range expansion. By default (and if one only specifies a length 2 expansion), everything works as before, with symmetric range expansion. But one can now also specify separate expansion constants for the lower and upper limits. This can be used to remove the gap in bar charts (while still leaving some space above / to the right of the bars):

ggplot2-expand-asym

I think many will find this a useful addition to ggplot2. Evidence:

  1. https://stackoverflow.com/questions/22480052/how-to-expand-axis-asymmetrically-with-ggplot2-without-setting-limits-manually?noredirect=1&lq=1
  2. https://stackoverflow.com/questions/34623780/asymmetric-expansion-of-limits-ggplot2-2-0
  3. https://stackoverflow.com/questions/34342386/pad-expand-only-the-top-of-continuous-scale-in-ggplot2?noredirect=1&lq=1

(The last link illustrates a use case for a different type of graph than a bar chart.)

@hadley
Member

hadley commented Aug 2, 2016

Hmmmm, that looks really weird to my eyes. I think ideally you want to have about the same margin in each dimension. The default ggplot2 margins aren't perfect (because of their additive/multiplicative nature), but this makes it even worse.

@thomasp85
Member

I personally think it has some merit for bar charts, but agree that it looks weird with the default theme. A style where only the y-axis is drawn and with a white background serves it better...

@huftis
Contributor Author

huftis commented Aug 2, 2016

Yes, that wasn’t an attempt to make a ‘perfect bar chart’ at all. (Firstly, I wouldn’t use the default theme, but typically one with a transparent background. And I would get rid of the tick marks. And the horizontal grid lines. And add some (symmetrical!) space (using expand) on the other axis. And follow Stephen Few’s latest recommendations on bar widths and the spaces between them (and yes, he does use asymmetrical ‘range expansion’).)

@thomasp85
Member

@hadley Can we either close this or discuss the preferred API for setting asymmetric expansion? I'm personally for this feature for the single use case of bar charts (I have done it manually with expand_limits a couple of times), but it is added complexity...

@hadley
Member

hadley commented Aug 23, 2016

It still looks weird to me, but I'd accept a PR that fixed it. @huftis would you be willing to take another shot at it? To produce cleaner code, you'll need to set up a basic system of S3 classes that insulate the complexity of the new approach into a central location.

@huftis
Contributor Author

huftis commented Aug 28, 2016

@hadley Sure, I’ll take a new look at creating a PR.

@hadley
Member

hadley commented Sep 23, 2016

@huftis are you still interested? We'll be making a release candidate of ggplot2 in one week, so it'll need to be done by then if you want it in the next version of ggplot2.

@huftis
Contributor Author

huftis commented Sep 23, 2016

Yes, still interested. Been a bit busy lately, but I’ll try to find time to get it done in the coming week.

@huftis
Contributor Author

huftis commented Sep 29, 2016

@hadley I have a simplified (mostly) working version, but I’m really puzzled by some existing code in scale-discrete-.r (https://github.com/hadley/ggplot2/blob/master/R/scale-discrete-.r). This maintains separate continuous and discrete scales, and the dimension() function at around line 109 does separate calculations for discrete and continuous scales. That’s OK. But for a combined continuous/discrete scale (which is what you actually get even for a simple ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()), the following is returned:

range(
  expand_range(c_range, expand[1], 0 , 1),
  expand_range(c(1, length(d_range)), 0, expand[2], 1)
)

My understanding of this is: First an expanded range is calculated for the continuous scale, but only using the multiplicative constant (i.e. setting the additive constant to zero). And then an expanded range is calculated for the discrete scale, but only using the additive constant (i.e. setting the multiplicative constant to zero). And then the interval enclosing both these ranges is returned. What is the logic behind this?

@thomasp85
Member

Try to update to the latest version - this should have been fixed...

@huftis
Contributor Author

huftis commented Sep 29, 2016

@thomasp85 I am using the latest version. That’s where I copied this piece of code from.

@thomasp85
Member

Oh, sorry - was thinking about something else slightly related (a few lines up)

@thomasp85
Member

Actually it is related - it is an oversight from when we fixed the other... Thanks for catching that

@huftis
Contributor Author

huftis commented Sep 29, 2016

I think I understand it a bit better, but I still think it looks strange. For

ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()

there will be three vertical bars, and c(1, length(d_range)) will be c(1,3). The continuous scale takes the bar widths into account, and so will be c(1-.9/2, 3+.9/2) == c(0.55, 3.45) (with the default bar width of .9). The range displayed must cover both, with a little extra space on each side.

Now, internally, this will be a discrete scale, so will have the default expansion of c(0, .6), i.e. add .6 on each side, to get c(0.4, 3.6), which is guaranteed to cover the bars for normal bar charts (since the bars have at most bar width equal to 1). The expand_range(c(1, length(d_range)), 0, expand[2], 1) code actually does this if the user hasn’t specified an expand argument. If the user has specified an expand argument, only the additive part is used (expand[2]).

For the continuous part, the calculation is expand_range(c_range, expand[1], 0 , 1). By default, the multiplicative part in c(0, .6) is 0, so this has no effect at all. But if a custom expand argument is used, the multiplicative part is used, while the additive part is ignored.
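The numbers in the last few paragraphs can be reproduced with a small base R sketch. `expand_range_sketch()` below is a simplified stand-in for scales::expand_range() (it ignores the zero-width case handled by the fourth argument), and `combined_dimension()` is a hypothetical wrapper around the expression quoted from scale-discrete-.r, so treat this as an approximation rather than the actual ggplot2 code.

```r
# Simplified stand-in for scales::expand_range() (non-zero-width ranges only):
# widen a range by a multiple of its width plus an additive constant.
expand_range_sketch <- function(range, mul = 0, add = 0) {
  range + c(-1, 1) * (diff(range) * mul + add)
}

# Hypothetical wrapper mirroring the quoted expression.
combined_dimension <- function(c_range, n_levels, expand) {
  range(
    expand_range_sketch(c_range, mul = expand[1]),        # continuous part: multiplier only
    expand_range_sketch(c(1, n_levels), add = expand[2])  # discrete part: additive term only
  )
}

bar_width <- 0.9
c_range <- c(1 - bar_width / 2, 3 + bar_width / 2)  # c(0.55, 3.45)

combined_dimension(c_range, n_levels = 3, expand = c(0, 0.6))     # default discrete expand
combined_dimension(c_range, n_levels = 3, expand = c(0.05, 0.6))  # custom expand
```

Both calls give roughly c(0.4, 3.6): with a 5 % multiplier the continuous part only grows to about c(0.405, 3.595), so the additive discrete part still determines the final limits.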

I guess this can be useful for data like

ggplot(mtcars, aes(factor(cyl), mpg)) + geom_jitter()

where expand_range(c_range, expand[1], 0 , 1) ensures that all the values are shown. But it would work just as well if it was replaced by just c_range. Note that in both cases parts of the data points may be hidden, e.g. for:

ggplot(mtcars, aes(factor(cyl), mpg)) + geom_jitter(width=10)

I guess what you really want is to have separate expand values for the discrete and continuous geoms. So that for instance the bar geom in a bar plot will have (by default) a .6 units space on either side while the point geom would have (by default) 5% extra space on either side. But this is not currently possible, since 1) a ‘combined’ scale is internally a discrete scale, and is only passed the c(0, .6) expand argument by default, not the c(0.05, 0) default expand argument used for continuous scales, 2) it’s not possible for the user to supply separate expand_cont and expand_discrete values for the two different types of scales.

However, if one wants to use a multiplicative constant for the continuous part (but no additive constant) and an additive constant for the discrete part (but no multiplicative constant), one can use the (undocumented) workaround scale_x_discrete(expand=c(.05, .6)). And I guess that’s the idea behind the code(?).

huftis added a commit to huftis/ggplot2 that referenced this issue Oct 1, 2016
The `expand` argument for `scale_*_continuous()` and `scale_*_discrete()`
now accepts separate expansion constants for the lower and upper range limits.

This is useful for creating bar charts where the bottom of the bars
are flush with the x axis but the bars still have some (automatically
calculated amount of) space above them:

```R
ggplot(mtcars) +
  geom_bar(aes(x = factor(cyl))) +
  scale_y_continuous(expand = c(0, 0, 0.1, 0))
```

It can also be useful for line charts, e.g. for counts over time,
where one wants to have a ‘hard’ lower limit of y = 0, but leave the
upper limit unspecified (and perhaps differing between panels),
but with some extra space above the highest point on the line.
(With symmetrical limits, the extra space above the highest point
could cause the lower limit to be negative.)

The syntax for the multiplicative and additive expansion
constants has been changed from `c(m, a)` to
`c(m_lower, a_lower, m_upper, a_upper)`. The old syntax will still
work, as length 2 vectors `c(m, a)` are expanded to `c(m, a, m, a)`
and length 3 vectors are expanded from `c(m1, a1, m2)` to
`c(m1, a1, m2, a1)`. (@huftis, tidyverse#1669)
@huftis
Contributor Author

huftis commented Oct 1, 2016

I’ve now added a new pull request. The code has been simplified, and all the calculations now happen in an expand_range4() utility function. Most of the diff is just changes in the documentation. :)

@dzion

dzion commented Oct 27, 2016

Has this somehow been implemented yet? I generally agree with @huftis on the usefulness in bar plots, but also with @hadley on the inelegance of the solution provided here....
May I suggest solving this analogously to the limits() solution, where NA can be specified as one of the parameters? This would provide perfect backward compatibility and is much easier to understand than adding 2 additional parameters to expand().
Providing NA would allow R to use the default for one of the expand() parameters, while the other one could be specified. This would allow for automatic rescaling of one of the limits, while setting the other one to a fixed value.
Unfortunately I am pretty busy right now otherwise I would take a stab at this myself.

@huftis
Contributor Author

huftis commented Oct 27, 2016

@dzion I haven’t had time to implement it yet, but will do so (though obviously not in time for the next ggplot2 release). It will probably have the syntax expand = expand_scale(mult=c(0, .1), add=c(0,0)) or something similar (which could be easily extended if ggplot2 is ever updated to support three scales/axes, i.e. 3D plots).
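As a rough illustration of what such a helper could look like, here is a minimal base R sketch. The name `expand_scale_sketch()` and the returned layout `c(mult_lower, add_lower, mult_upper, add_upper)` are assumptions based on the syntax discussed in this thread, not the function as it may eventually be implemented.

```r
# Hypothetical sketch of the proposed helper; the name and the 4-element
# return layout c(mult_lower, add_lower, mult_upper, add_upper) are assumptions.
expand_scale_sketch <- function(mult = 0, add = 0) {
  stopifnot(is.numeric(mult), is.numeric(add))
  stopifnot(length(mult) %in% 1:2, length(add) %in% 1:2)
  mult <- rep(mult, length.out = 2)  # a single value applies to both limits
  add  <- rep(add,  length.out = 2)
  c(mult[1], add[1], mult[2], add[2])
}

expand_scale_sketch(mult = c(0, 0.1), add = c(0, 0))  # no space below, 10 % above
expand_scale_sketch(mult = 0.05)                      # symmetric 5 % on both sides
```

Naming the multiplicative and additive parts separately avoids having to remember the position of each value in a bare numeric vector.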

@dzion

dzion commented Oct 27, 2016

@huftis Don't you feel that simply adding the NA option would provide a good enough workaround, without adding further options, that would generally solve the problem in many of the cases?
I agree that specifying a multiplicative value for expand instead of an additive value is more straightforward and, for my case, usually what is needed, but sometimes simply providing an absolute value (which seems to be the norm for ggplot2 in other instances) can also be nice - thus I like your solution a lot!
However if you are implementing your solution and find the time, I think implementing an NA option in expand() could benefit users as well, as it is easy to understand and in line with currently existing syntax.
Thanks for tackling this issue

@thomasp85
Member

I would suggest adding the parameters expand_lower and expand_upper to the scale constructor. If provided they take precedence over the value passed to expand and you can provide only one of them to only override one side. This will also ensure backward compatibility...
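The precedence rule suggested here could be sketched roughly as follows. This is purely illustrative: `resolve_expand()` is a hypothetical function, and it assumes that expand_lower and expand_upper would each be a `c(mult, add)` pair, which the comment does not specify.

```r
# Hypothetical resolver: expand_lower / expand_upper are assumed to be
# c(mult, add) pairs that override the matching side of a symmetric `expand`.
resolve_expand <- function(expand, expand_lower = NULL, expand_upper = NULL) {
  stopifnot(is.numeric(expand), length(expand) == 2)
  lower <- if (is.null(expand_lower)) expand else expand_lower
  upper <- if (is.null(expand_upper)) expand else expand_upper
  c(lower, upper)  # c(mult_lower, add_lower, mult_upper, add_upper)
}

resolve_expand(c(0.05, 0))                          # unchanged symmetric behaviour
resolve_expand(c(0.05, 0), expand_lower = c(0, 0))  # flush at the bottom, 5 % above
```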

@hadley
Member

hadley commented Oct 27, 2016

We already have a reasonable API (that I like) in the PR, so I don't think we need to discuss it further.

@thomasp85
Member

Ok - didn't know the API was settled. I'll show myself out then🙂

@hadley hadley added feature a feature request or enhancement scales 🐍 labels Jan 25, 2017
@Gnossos

Gnossos commented Mar 14, 2017

I think similar issues arise with area plots. They should sit on the horizontal axis and kiss the left and right borders. One can do this with the current expand(), but then the area plots bump their heads.

@dhimmel

dhimmel commented May 3, 2017

I'm also excited for the bar plot use case. For example, I'd love to expand every side besides the bottom in the plot below:

barplot-expand

Thanks @huftis and @hadley for their work on #1805. In the meantime, is there a workaround for faceted plots?

hadley pushed a commit that referenced this issue Jul 10, 2017
…#1805)

* Allow separate expansion values for lower and upper range limits.

The `expand` argument for `scale_*_continuous()` and `scale_*_discrete()`
now accepts separate expansion constants for the lower and upper range limits.

This is useful for creating bar charts where the bottom of the bars
are flush with the x axis but the bars still have some (automatically
calculated amount of) space above them:

```R
ggplot(mtcars) +
  geom_bar(aes(x = factor(cyl))) +
  scale_y_continuous(expand = c(0, 0, 0.1, 0))
```

It can also be useful for line charts, e.g. for counts over time,
where one wants to have a ‘hard’ lower limit of y = 0, but leave the
upper limit unspecified (and perhaps differing between panels),
but with some extra space above the highest point on the line.
(With symmetrical limits, the extra space above the highest point
could cause the lower limit to be negative.)

The syntax for the multiplicative and additive expansion
constants has been changed from `c(m, a)` to
`c(m_lower, a_lower, m_upper, a_upper)`. The old syntax will still
work, as length 2 vectors `c(m, a)` are expanded to `c(m, a, m, a)`
and length 3 vectors are expanded from `c(m1, a1, m2)` to
`c(m1, a1, m2, a1)`. (@huftis, #1669)

* Added `expand_scale()` function for easier generation of scale expansion
vectors.

Instead of having to manually specify an `expand` argument using
a somewhat confusing syntax (a vector of 2, 3 or 4 numeric values),
it’s now possible to use the user-friendly (and documented)
`expand_scale()` function.

This commit also cleans up the documentation related to the
`expand` argument, which was duplicated in several functions.

* Added UTF-8 character encoding declaration to DESCRIPTION.

The documentation for one of the functions had a non-breaking space
(between a number and the word ‘units’), which caused R CMD check to
complain about ‘non-ASCII input and no declared encoding’.
This adds a character encoding declaration of UTF-8 to the DESCRIPTION
file to fix this problem.

* Fixed some style issues.

* Updated and regenerated documentation.

* Specify character encoding used for documentation.

* Minor grammar improvement in documentation.

* Don’t generate documentation for internal function expand_range4().
karawoo pushed a commit to karawoo/ggplot2 that referenced this issue Jul 14, 2017
…tidyverse#1805)

@zgzhao

zgzhao commented Sep 26, 2017

A workaround is to add a blank geom_text layer to expand the upper limit of a bar chart. For example:

ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~gear, scales = "free_y") +
  scale_y_continuous(expand = c(0, 0)) +
  geom_text(aes(y = mpg * 1.1, label = ""))

@hadley hadley closed this as completed Oct 30, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jun 18, 2018