Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify plotting barplot by group #1223

Closed
wants to merge 8 commits into from
Closed

simplify plotting barplot by group #1223

wants to merge 8 commits into from

Conversation

piever
Copy link
Contributor

@piever piever commented Aug 16, 2021

This allows specifying x_distance in a bar plot. It corresponds to the bar width + the x_gap. The rationale is that we compute this from the data, assuming that bar width plus x_gap should equal minimum(diffs(x)), but in categorical data we just want that to be 1 (otherwise things get problematic if in some group not all categories are present).

With this PR, AlgebraOfGraphics can simply set x_distance = 1 and all the user settings are automatically respected.

Example

julia> f = Figure()

julia> ax = Axis(f[1, 1]);

julia> barplot!(ax, [1, 4, 7], rand(3), x_distance = 1);

julia> barplot!(ax, [2, 3, 5, 6], rand(4), x_distance = 1);

julia> f

barplot

I'm actually wondering whether x_distance = 1 (rather than minimum(diffs(x))) is a good default when x is not a range, but maybe setting these defaults should be done on the AlgebraOfGraphics side.

@jkrumbiegel
Copy link
Collaborator

So this should supersede specifying width? I think it's a bit confusing to have three attributes that control just one thing really, which is bar width. Maybe we can simplify more?

@piever
Copy link
Contributor Author

piever commented Aug 16, 2021

I am not 100% happy with the status quo, where the user could set x_gap (proportion of empty space between bars) but not x_distance. In this scenario, the logic to set a default width for categorical variables in AoG becomes a bit complex. AoG should also look for user-passed x_gap and set width = 1 - x_gap, which could be a bit tricky if for example x_gap is passed in the BarPlot theme. Instead, just setting x_distance = 1 on the AoG side is much more robust.

Another approach to make this work more robustly for categorical plots would be to not make x_distance customisable (so we keep the API of 2.), but have it default to 1 when the x values are not an AbstractRange. So, barplot([1, 3, 7], rand(3)) would not guess width = (1 - x_gap) * 2, but width = (1 - x_gap) instead. Would that be OK?

In terms of simpler APIs, I imagine one could choose between:

  1. Just use width. To have more space between bars, the user just passes a smaller width
  2. Status quo (width and x_gap)
  3. This PR (width, x_gap, and x_distance)

Do you have preferences among these three options? 1. and 2. would also be fine from the AoG side if we implement the smarter x_distance default here (1 if x data is not a range).

@jkrumbiegel
Copy link
Collaborator

I think there should be two values. The distance between bars (or the base width for each bar) and a shrink factor. The shrink factor could be relative to base width or absolute, I don't know what's more useful in situations where bars have different widths (probably absolute is easier there) but relative would give a consistent look independent of base width, so it's probably a better default. (We could make the absolute / relative flag another attribute actually).

@piever
Copy link
Contributor Author

piever commented Aug 24, 2021

The distance between bars (or the base width for each bar) and a shrink factor. The shrink factor could be relative to base width or absolute.

I'd be in favor of relative shrink factor. So, assuming the user is doing barplot(1:3, rand(3)), the default should be

width = 1 (computed from the data if not provided)
shrink = 0.8

which results in actual width = 1 * 0.8, and the user should pass shrink = 1 to ensure that the user-passed width matches the drawn width?

There is also the dodge_gap = 0.03 attribute (multiply by 1-0.03) which may need to be turned to dodge_shrink = 1 - dodge_gap for consistency I guess.

@jkrumbiegel
Copy link
Collaborator

shrink = 0.8

Is it better to specify 0.8 or 0.2? I kind of tend towards 0.2, so the gap, because I guess having a gap is the reason to shrink the bar. But 0.8 would also be fine. Whatever we decide, the name should be chosen well. If we already have dodge_gap maybe it's better to stay with the gap terminology like we currently do.

@piever
Copy link
Contributor Author

piever commented Aug 25, 2021

Ok, so this PR should:

  1. rename x_gap to gap
  2. have width default to minimum(diff(x)) ("the maximum available space")
  3. draw bars with width width * (1 - gap)

Last doubt. I'm a bit unsure about minimum(diff(x)) as a default in case x is not a range but an arbitrary vector. Would it make sense to have simply width = 1 in that case (like we do for barplot and violin), and instead choose width = step(x) when x is a range? Or would that behavior be too subtle?

@jkrumbiegel
Copy link
Collaborator

for ranges, step(x) would be equal to minimum(diff(x)) no? I wonder if it would make sense to pick a vector of widths to account for cases in which the distances are different, but I'm not sure if that can be done well. I don't think minimum(diff) is too bad, probably is the right heuristic in common cases.

@piever
Copy link
Contributor Author

piever commented Aug 26, 2021

Yes, I agree, I've kept minimum(diff(x)) as the default width.

I've added a mention in the docstrings that the width of the bar is width * (1 - gap).

If the docs / tests build fine, and we think the docstrings / attribute names are clear enough, this should be good to go.

@SimonDanisch
Copy link
Member

Can we merge this?

@piever
Copy link
Contributor Author

piever commented Sep 29, 2021

I've just noticed that the Gantt example in the docs hadn't been updated for the x_gap -> gap rename, so I've fixed it in the last commit. This PR implements point 1,2,3 from #1223 (comment). If Julius is also happy with the API, I think we can go ahead and merge.

@SimonDanisch SimonDanisch mentioned this pull request Oct 19, 2021
@SimonDanisch
Copy link
Member

SimonDanisch commented Oct 19, 2021

merged in #1393

@SimonDanisch SimonDanisch deleted the pv/barwidth branch January 9, 2022 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants