Auto: can it be made to make a timeseries bar chart? #1800

Closed
tophtucker opened this issue Aug 10, 2023 · 5 comments · Fixed by #1801
Labels
enhancement New feature or request

Comments

@tophtucker
Contributor

tophtucker commented Aug 10, 2023

Currently, you can only make a bar chart with a temporal/quantitative independent axis if the dependent axis has a reducer specified (implicitly or explicitly). But you cannot make a temporal/quantitative bar chart with a simple value as the dependent encoding.

  • Plot.auto(olympians, {x: "date_of_birth"}).plot() → implicit count reducer on y, hence temporal bar chart
  • Plot.auto(aapl, {x: "Date", y: "Volume", mark: "bar"}).plot() → surprising sparse matrix of cells

I know the bar-interval thing is a perennial headache lol, and others have raised it, but I’m feeling the pain right now and feeling motivated to come up with something!

E.g. for Plot.auto(aapl, {x: "Date", y: "Volume"}).plot(), we get a line, but it is inappropriate to interpolate values between successive days, so (hairline) bars would be better.

I may have made this worse in #1674, which changed it to choose cell over rect if there's no reducer, which might've been too aggressive and prevents us from taking advantage of the rect's implicit y1 = 0. (But I'm sleepy and not sure it ever would've worked; can check later.) (Nope, see below.)

Might be improved by #1790.

Example notebook:

[image]
@tophtucker tophtucker added the enhancement New feature or request label Aug 10, 2023
@tophtucker
Contributor Author

Ok well at least it isn’t a regression. I was worried I’d made it worse, but no, it’s just the age-old ordinal interval problem; at 0.6.8, one commit before #1674, nothing shows at all:

const data = [
    {date: new Date("2023-04-01"), type: "triangle", value: 5},
    {date: new Date("2023-04-05"), type: "circle", value: 7},
    {date: new Date("2023-04-10"), type: "circle", value: 8},
    {date: new Date("2023-04-15"), type: "circle", value: 3},
    {date: new Date("2023-04-15"), type: "triangle", value: 7},
    {date: new Date("2023-04-20"), type: "triangle", value: 4},
    {date: new Date("2023-04-25"), type: "square", value: 5}
];
return Plot.auto(data, {x: "date", y: "value", color: "type", mark: "bar"}).plot();

@mbostock
Member

Imagine that instead of value you have temperature. Would you expect that mark: "bar" implies stacking and a zero baseline, too? Maybe it’s still better than treating value as ordinal… 🤔

@mbostock
Member

This also seems like a bug:


Plot.auto(data, {x: "date", y: {value: "value", zero: true}, color: "type", mark: "bar"}).plot()

I would expect this to produce the barY also.

I think the real problem is that x is not ordinal (it’s dates)… and we’re hesitant to treat temporal data as ordinal because there could be gaps, and we can’t easily infer the interval. But perhaps it’s better to treat x as ordinal when you explicitly specify the bar mark, and certainly that should be better than showing nothing?
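To make the "gaps" concern concrete, here is a minimal sketch in plain JavaScript (no Plot required) that measures the spacing between the distinct dates in the sample data from earlier in the thread. The helper name `dayGaps` is hypothetical and is not part of Plot; it only illustrates why no single interval can be read off the data.

```javascript
// Dates from the sample data earlier in the thread (parsed as UTC midnight).
const dates = [
  "2023-04-01", "2023-04-05", "2023-04-10", "2023-04-15",
  "2023-04-15", "2023-04-20", "2023-04-25"
].map((d) => new Date(d));

// Hypothetical helper: gaps, in days, between consecutive distinct dates.
function dayGaps(dates) {
  const distinct = [...new Set(dates.map((d) => +d))].sort((a, b) => a - b);
  const gaps = [];
  for (let i = 1; i < distinct.length; i++) {
    gaps.push((distinct[i] - distinct[i - 1]) / 864e5); // 864e5 ms per day
  }
  return gaps;
}

const gaps = dayGaps(dates);
console.log(gaps); // [4, 5, 5, 5, 5]

// More than one distinct gap means the data is not on a regular interval.
const isRegular = new Set(gaps).size === 1;
console.log(isRegular); // false
```

With gaps of both 4 and 5 days, a guessed interval would either leave holes or misplace bars, which is presumably why treating the dates as ordinal is tempting when the bar mark is requested explicitly.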

@mbostock
Member

Another possibility is that we default to the sum reduce in this case, somehow (because x is temporal)?


Plot.auto(data, {x: "date", y: {value: "value", reduce: "sum"}, color: "type", mark: "bar"}).plot()
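As a rough illustration of what a sum reducer with stacking would need to compute for this data, the sketch below derives stacked extents per date in plain JavaScript. The function `stackByDate` and the `y1`/`y2` field names are assumptions made for illustration; this is not how Plot implements its reducers or stack transform.

```javascript
const data = [
  {date: new Date("2023-04-01"), type: "triangle", value: 5},
  {date: new Date("2023-04-05"), type: "circle", value: 7},
  {date: new Date("2023-04-10"), type: "circle", value: 8},
  {date: new Date("2023-04-15"), type: "circle", value: 3},
  {date: new Date("2023-04-15"), type: "triangle", value: 7},
  {date: new Date("2023-04-20"), type: "triangle", value: 4},
  {date: new Date("2023-04-25"), type: "square", value: 5}
];

// Hypothetical helper: stack values from a zero baseline within each date.
function stackByDate(rows) {
  const totals = new Map(); // running total per date key
  return rows.map(({date, type, value}) => {
    const key = date.toISOString().slice(0, 10);
    const y1 = totals.get(key) ?? 0; // bottom of this segment
    const y2 = y1 + value; // top of this segment
    totals.set(key, y2);
    return {date: key, type, y1, y2};
  });
}

const stacked = stackByDate(data);
console.log(stacked.filter((d) => d.date === "2023-04-15"));
// [ { date: '2023-04-15', type: 'circle', y1: 0, y2: 3 },
//   { date: '2023-04-15', type: 'triangle', y1: 3, y2: 10 } ]
```

Only 2023-04-15 has two rows, so it is the only date where summing changes the bar height: the two segments stack to a total of 10.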

@tophtucker
Contributor Author

tophtucker commented Aug 10, 2023

> Imagine that instead of value you have temperature. Would you expect that mark: "bar" implies stacking and a zero baseline, too?

Yes, I think so. Bar implies meaningful zero, which implies stackability. At least, I can’t think of a counterexample. (I.e., bars for temperature already feels wrong.) Sometimes people use bars because they don’t want a line’s interpolation between points, but if it’s not zeroful/additive it should probably be ticks.

> I think the real problem is that x is not ordinal (it’s dates)… and we’re hesitant to treat temporal data as ordinal because there could be gaps, and we can’t easily infer the interval. But perhaps it’s better to treat x as ordinal when you explicitly specify the bar mark, and certainly that should be better than showing nothing?

Yeah, I wonder if this is the “get off your gg high horse and do what I obviously mean!” approach. I think #1790 addresses the main thing people would complain about… leaving the things they should be complaining about. But sometimes even the irregularity is a feature, e.g. if you wanna show stock market trading days without gaps for the weekends.

> Another possibility is that we default to the sum reduce in this case, somehow (because x is temporal)?

Hm! That’s a nice idea. Defaulting to sum would work nicely with autoSpec, showing in the UI and letting you opt out if needed. And it’d work well if (when) we add interval controls in the future. It’d be maddening if you just wanted one bar per data point, but that’s already an issue.

If y were temperature, the sum reducer would be inappropriate; a non-additive reducer like mean or last would be better. But, again, I guess selecting “bar” already implies additivity?
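The difference between additive and non-additive reducers can be sketched in plain JavaScript. The temperature readings below are made up for illustration, and `reduceByDate` and `reducers` are hypothetical names, not Plot APIs.

```javascript
// Hypothetical readings: two on the first day, one on the second.
const readings = [
  {date: "2023-04-01", temperature: 10},
  {date: "2023-04-01", temperature: 14},
  {date: "2023-04-02", temperature: 12}
];

// Three candidate reducers over an array of numbers.
const reducers = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  mean: (values) => values.reduce((a, b) => a + b, 0) / values.length,
  last: (values) => values[values.length - 1]
};

// Hypothetical helper: group by date, then apply the given reducer.
function reduceByDate(rows, reduce) {
  const groups = new Map();
  for (const {date, temperature} of rows) {
    if (!groups.has(date)) groups.set(date, []);
    groups.get(date).push(temperature);
  }
  return new Map([...groups].map(([date, values]) => [date, reduce(values)]));
}

console.log(reduceByDate(readings, reducers.sum).get("2023-04-01")); // 24
console.log(reduceByDate(readings, reducers.mean).get("2023-04-01")); // 12
console.log(reduceByDate(readings, reducers.last).get("2023-04-01")); // 14
```

A summed temperature of 24 is not a meaningful reading for the day, whereas the mean and last values are, which is the sense in which sum only suits additive quantities.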
