Stacked Area Charts #2960

Merged
merged 14 commits into from Sep 7, 2018

3 participants
@alexcjohnson
Contributor

commented Sep 1, 2018

We held out a long time on this one, but stacked area charts are finally coming to plotly.js.

The API is as discussed in #1217:

  • Provide matching stackgroup attributes to some scatter traces and they become a stack.
  • There are no plot-wide stacking attributes; stack-wide attributes are in the trace definitions, and we'll take a value for each attribute from the first trace in the stack that contains that attribute, visible or not (so the stack doesn't fall apart if you hide the first trace). This is different from, and more powerful than, how we describe bar stacking/grouping - and should be reviewed with an eye toward eventually using a similar framework for bars.
  • The data for all traces in the stack are sorted by position, and gaps in each trace are filled in with either zeros or interpolated values.
  • The one item from #1217 I did not include here is stackgaps: 'interrupt'. That's going to require some finicky drawing code, particularly if we want to support arbitrary line.shape, so I'll leave it for later. But 'infer zero' (default) and 'interpolate' are included here.
  • Another open item is to improve hover info. What I did here matches stacked bars, but both of them, particularly if you normalize the results, would benefit from more options - normalized vs raw data, (sub)totals.
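As a rough illustration of the rules in the list above, here is a minimal sketch in plain JavaScript. It is not the plotly.js implementation: `stackAttr`, `valueAt` and `stackTraces` are hypothetical helper names, each trace's `x` is assumed to be non-empty and already sorted ascending, and the handling of gaps outside a trace's own range under 'interpolate' (hold the nearest value) is an assumption.

```javascript
// Traces as a user might write them: matching `stackgroup` values form one stack.
// Note that `stackgaps` is set on the second trace only.
const traces = [
  {x: [1, 2, 3], y: [1, 2, 3], stackgroup: 'one'},
  {x: [1, 3], y: [4, 6], stackgroup: 'one', stackgaps: 'interpolate'}
];

// Stack-wide attributes: take the value from the first trace that sets it.
function stackAttr(stack, name, dflt) {
  const owner = stack.find(t => t[name] !== undefined);
  return owner ? owner[name] : dflt;
}

// Value of one trace at position x, filling gaps according to `stackgaps`.
function valueAt(trace, x, stackgaps) {
  const i = trace.x.indexOf(x);
  if (i !== -1) return trace.y[i];
  if (stackgaps !== 'interpolate') return 0; // 'infer zero'
  const next = trace.x.findIndex(xi => xi > x);
  if (next === -1) return trace.y[trace.y.length - 1]; // assumption: hold last value
  if (next === 0) return trace.y[0];                    // assumption: hold first value
  const x0 = trace.x[next - 1], x1 = trace.x[next];
  const y0 = trace.y[next - 1], y1 = trace.y[next];
  return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
}

// Returns, per trace, the stacked (cumulative) y values at every position
// that appears anywhere in the stack.
function stackTraces(stack) {
  const stackgaps = stackAttr(stack, 'stackgaps', 'infer zero');
  const positions = [...new Set(stack.flatMap(t => t.x))].sort((a, b) => a - b);
  const running = positions.map(() => 0);
  return stack.map(trace => ({
    x: positions,
    y: positions.map((x, i) => (running[i] += valueAt(trace, x, stackgaps)))
  }));
}

console.log(stackTraces(traces).map(t => t.y)); // [[1, 2, 3], [5, 7, 9]]
```

With the `stackgaps` attribute removed, the second trace gets an inferred zero at x = 2 and its stacked values become [5, 2, 9].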

In order to make it work well in various edge cases I made a number of preparatory changes:

  • Lib.sort c87ccb3 wraps the built-in Array.sort with a check for whether the array is already perfectly sorted (or perfectly reversed). For arrays of length 1e5+ this can be a 10x or better speedup when the input is already sorted, and it should carry very little penalty for unsorted arrays. For stacked area I expect the data will already be sorted the vast majority of the time, which is why I implemented this now, and this is the only place I used it, but I bet there are other places it would be useful as well.
  • Some edge case improvements in autorange 1f4898c - I changed a few baseline images (and one mock), I hope you'll agree these were actually incorrect before.
  • Better ordering of hover labels when traces have matching data (such as in a bar or area stack when one trace is zero) - try to preserve the stacking order 2fde3dc
  • Continue lines off the edge toward invalid log values 68b489d - I think I hadn't done this before (for scatter) out of caution, lest we draw something misleading; I had opted to just not draw the line at all. But particularly with fills, and even more so with stacked fills, that gets confusing and misleading, as the fills would just connect across the missing point(s). I opted to draw these lines straight toward the edge if one dimension goes invalid (since in principle they're going infinitely far away) or at a slope of 1 on a log/log plot if both dimensions go invalid simultaneously. Note that there are cases here where a separate point will move across these lines if you flip between linear and log axes, but that was already possible with finite data; this is just an extreme case of the same. (Note: the axes_range_type baseline change belongs in this commit, but I put it in the autorange commit instead.)
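The sorted-check idea described for Lib.sort can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from c87ccb3: `sortedAware` is a hypothetical name, and the comparator is assumed to follow the usual Array.sort convention (negative, zero, positive).

```javascript
// One linear pass decides whether the input is already ascending, already
// descending, or genuinely unsorted; only the last case pays for a full sort.
function sortedAware(array, compare) {
  let notOrdered = false;  // saw a pair that breaks ascending order
  let notReversed = false; // saw a pair that breaks descending order
  for (let i = 1; i < array.length; i++) {
    const c = compare(array[i], array[i - 1]);
    if (c < 0) notOrdered = true;
    else if (c > 0) notReversed = true;
    // Both orders are ruled out: fall back to the built-in sort.
    if (notOrdered && notReversed) return array.sort(compare);
  }
  // Either already ascending (return as-is) or descending (reverse in place).
  return notOrdered ? array.reverse() : array;
}

const byValue = (a, b) => a - b;
console.log(sortedAware([1, 2, 3], byValue)); // [1, 2, 3]
console.log(sortedAware([3, 2, 1], byValue)); // [1, 2, 3]
console.log(sortedAware([2, 3, 1], byValue)); // [1, 2, 3]
```

For already-sorted input the comparator is called n - 1 times and no sort runs; for unsorted input the pass usually exits after a few elements, which is why the overhead should be small.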

cc @etpinard @antoinerg @nicolaskruchten

@@ -75,7 +75,6 @@
{
"x": [1.5],
"y": [1.25],
"fill": "tonexty",

@alexcjohnson

alexcjohnson Sep 1, 2018

Author Contributor

Oh this change I think is actually required due to a change in the stacked area commit be38e93#diff-33c02cd37e7a4c951059a3c93221ac4eR175 - we were accidentally treating a length-1 trace as filling to itself (since its start and end points are the same!) but we shouldn't do that... therefore this trace, since it's the first on its subplot, should interpret 'tonexty' as 'tozeroy'.

var subplotAndType = trace.xaxis + trace.yaxis + trace.type;
var firstScatter = fullLayout._firstScatter;
if(!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
}

@alexcjohnson

alexcjohnson Sep 1, 2018

Author Contributor

In fact, scatter_fill_corner_cases top subplots were also prevented from filling to zero with 'tonexty' because only one subplot could have the "first scatter" trace on it. This commit 🔪 gd.firstscatter and replaces it with one trace (uid) per subplot, attached to fullLayout. The stacked area mocks 🔒 this.
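To make the per-subplot bookkeeping concrete, here is a small sketch built around the snippet quoted above. Only the three lines that build `subplotAndType` and fill `_firstScatter` come from the diff; `markFirstScatter`, `effectiveFill` and the plain-object stand-ins for `fullLayout` and the traces are hypothetical, and the real lookup in plotly.js may be structured differently.

```javascript
// Record, for each subplot + trace type, the uid of the first trace seen.
function markFirstScatter(fullLayout, traces) {
  fullLayout._firstScatter = {};
  for (const trace of traces) {
    const subplotAndType = trace.xaxis + trace.yaxis + trace.type;
    const firstScatter = fullLayout._firstScatter;
    if (!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
  }
}

// The first trace on a subplot has nothing to fill to, so 'tonexty'
// is interpreted as 'tozeroy' for it.
function effectiveFill(fullLayout, trace) {
  const subplotAndType = trace.xaxis + trace.yaxis + trace.type;
  const isFirst = fullLayout._firstScatter[subplotAndType] === trace.uid;
  return isFirst && trace.fill === 'tonexty' ? 'tozeroy' : trace.fill;
}

const fullLayout = {};
const fullData = [
  {uid: 'a', type: 'scatter', xaxis: 'x', yaxis: 'y', fill: 'tonexty'},
  {uid: 'b', type: 'scatter', xaxis: 'x', yaxis: 'y', fill: 'tonexty'},
  {uid: 'c', type: 'scatter', xaxis: 'x2', yaxis: 'y2', fill: 'tonexty'}
];
markFirstScatter(fullLayout, fullData);
console.log(fullData.map(t => effectiveFill(fullLayout, t)));
// ['tozeroy', 'tonexty', 'tozeroy']
```

Because the key includes the subplot, trace 'c' on the second subplot also fills to zero, which a single plot-wide "first scatter" flag could not express.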

@etpinard

etpinard Sep 4, 2018

Member

Nice. This is probably the most underrated piece in this PR. I always found that gd.firstscatter less-than-ideal. This is a welcome improvement.

if(cd.length !== serieslen) {
// TODO: verify this never happens and remove
throw new Error('length mismatch!');
}

@alexcjohnson

alexcjohnson Sep 1, 2018

Author Contributor

Oops, missed this one... well, when I test ^^ I'll have plenty of confidence to remove it 😄

@alexcjohnson

alexcjohnson Sep 5, 2018

Author Contributor

🔪 in 8547cf8

// if we're stacking, "infer zero" gap mode gets markers in the
// gap points - because we've inferred a zero there - but other
// modes (currently "interpolate", later "interrupt" hopefully)
// we don't draw generated markers

@alexcjohnson

alexcjohnson Sep 1, 2018

Author Contributor

@etpinard @nicolaskruchten do you agree with this choice? It only applies to points we generate in one trace to match the positions from another trace - those are the "gap points"

@etpinard

etpinard Sep 4, 2018

Member

do you agree with this choice?

+1 for me.

@etpinard
Member

left a comment

Great PR!

I hope implementing those per-trace stack* attributes wasn't too much of a headache. The two stackgaps modes are looking great. 📈

Most of my comments are simply comments, with the exception of:

  • I don't think we need that alwaysSupplyDefaults trace module category
  • mutating gd.calcdata[i][j].i isn't great.
  • is that hacky fill default logic really necessary?
  • what do you think about adding a 'stack' flag to scatter mode?
@nicolaskruchten

Member

commented Sep 5, 2018

So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

@alexcjohnson

Contributor Author

commented Sep 5, 2018

So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

It's unusual for sure, but I wouldn't want to rule it out. What if you have one stack series for data and another for fits? Then one stack would need its fill removed, since they're overlapping. Or prediction/extrapolation - these might not overlap but still you might want different styling for corresponding items in each stack. Or two back-to-back stacks, like those plots that have male on one side and female on the other, with the axis in the middle (we could manage that one with an analog of barmode: 'relative', or perhaps even better two axes with a constraint so you don't need to flip your data... but you see the point)

What do you think about adding a 'stack' flag to scatter mode to make it easier to toggle stacked areas on and off?

mode is otherwise all about how to draw the series, not where to draw it... and the one bit of how that stacking impacts (fill) isn't even part of mode.

But, perhaps both of these concerns could be assuaged by making a boolean stack attribute, then giving stackgroup a default value but only coercing it when stack is true? (and for completeness, if you provide only a stackgroup let stack default to true). That way the usual behavior would be to just use the boolean but the full flexibility would still be available (if perhaps buried in the UI)

@nicolaskruchten

Member

commented Sep 5, 2018

OK, I'll buy the "back to back stacks" argument :)

Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

@nicolaskruchten

Member

commented Sep 5, 2018

I think we can live without an extra stack attribute personally :)

@alexcjohnson

Contributor Author

commented Sep 5, 2018

Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

Within subplots (and within stack groups, if there are multiple on one subplot) -> 00d7d22
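The scoping rule stated here (normalize within each subplot, and within each stack group on that subplot) can be sketched as below. This is an illustration only: `normalizeStacks` is a hypothetical helper, the attribute that switches normalization on is not named in this thread so the sketch always normalizes to percent, traces in one group are assumed to share the same positions, and mapping a zero total to 0 is an assumption.

```javascript
// Normalize each trace's y values to percent of its own stack's total,
// where a "stack" is identified by subplot (xaxis + yaxis) plus stackgroup.
function normalizeStacks(traces) {
  const keyOf = t => t.xaxis + t.yaxis + t.stackgroup;
  const totals = {}; // stack key -> per-position totals
  for (const t of traces) {
    const sum = totals[keyOf(t)] || (totals[keyOf(t)] = t.y.map(() => 0));
    t.y.forEach((v, i) => { sum[i] += v; });
  }
  return traces.map(t => ({
    ...t,
    y: t.y.map((v, i) => {
      const total = totals[keyOf(t)][i];
      return total ? 100 * v / total : 0; // assumption: empty position -> 0
    })
  }));
}

const normalized = normalizeStacks([
  {xaxis: 'x', yaxis: 'y', stackgroup: 'a', y: [1, 3]},
  {xaxis: 'x', yaxis: 'y', stackgroup: 'a', y: [3, 1]},
  // Same group name on a different subplot: normalized on its own.
  {xaxis: 'x2', yaxis: 'y2', stackgroup: 'a', y: [5, 5]}
]);
console.log(normalized.map(t => t.y)); // [[25, 75], [75, 25], [100, 100]]
```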

@etpinard etpinard added this to the v1.41.0 milestone Sep 6, 2018

@etpinard

Member

commented Sep 6, 2018

Down to 2️⃣ unresolved comments:

@alexcjohnson alexcjohnson referenced this pull request Sep 7, 2018

Open

v2.0.0 wishlist #420

0 of 15 tasks complete
@etpinard

Member

commented Sep 7, 2018

Nicely done 💃

@alexcjohnson alexcjohnson merged commit 70fcaad into master Sep 7, 2018

6 checks passed

ci/circleci: build Your tests passed on CircleCI!
ci/circleci: test-image Your tests passed on CircleCI!
ci/circleci: test-image2 Your tests passed on CircleCI!
ci/circleci: test-jasmine Your tests passed on CircleCI!
ci/circleci: test-jasmine2 Your tests passed on CircleCI!
ci/circleci: test-syntax Your tests passed on CircleCI!

@alexcjohnson alexcjohnson deleted the stacked-area branch Sep 7, 2018
