Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto mark #1238

Merged
merged 22 commits into from
Feb 3, 2023
Merged

Auto mark #1238

merged 22 commits into from
Feb 3, 2023

Conversation

tophtucker
Copy link
Contributor

@tophtucker tophtucker commented Jan 26, 2023

Adds Plot.auto, a shapeshifter mark that assumes the form of a bar, rect, line, area, or dot, depending on inferences from the data it’s passed. That is, you start from an intention about which dimensions of data you want to see, not what mark you want to see them in. In order to get you to a sensible chart even faster, it's even more opinionated than other marks.

E.g., Plot.auto(aapl, {x: "Volume"}).plot() gives you a histogram, implicitly binning on x and setting y to the count reducer (corresponding to Plot.rectY(aapl, Plot.binX({y: "count"}, {x: "Volume"})).plot()).

Plot.auto(data, options)

The options are six "channels" and a mark override:

  • x: corresponds to each mark's x channel; if specified without y, bins or groups on x and shows the count on y
  • y: corresponds to each mark's y channel; if specified without x, bins or groups on y and shows the count on x
  • fx: corresponds to each mark's fx channel
  • fy: corresponds to each mark's fy channel
  • color: corresponds to stroke (for line, rule, dot) or fill (for area, rect, bar, cell)
  • size: corresponds to the dot's r channel; setting this always results in a dot mark
  • mark: dot, line, area, rule, or bar; each option except dot includes x and y variants, and bar tries picks the appropriate mark from barX, barY, rectX, rectY, rect, or cell

The six "channels" take one of the following:

  • a string, which can be a field name or a reducer name
  • an accessor function
  • an object {value: field, reduce: reducer}, where field and reducer are strings or functions

Setting a reducer on x or y implicitly groups or bins on the other (y or x).

  • Add unit tests for errors
  • Add more description error for Plot.auto(aapl).plot() than "mark is not a function"
  • Pass through other channel options for e.g. bin thresholds
  • How to specify a constant color, either color: {constant: "red"} or maybeColorChannel?
  • How to specify a z for multi-series line or area marks?
  • How to limit and sort bar charts?
  • Provide a default sizeReduce for ordinal/ordinal case?
  • Document "magical" behavior, and how it may change in a future version!

@Fil
Copy link
Contributor

Fil commented Jan 27, 2023

I think this will lead us to add a new color option to all marks (where dot would have stroke default to color, and bar or rect would have fill default to color). Similar story for size and r, maybe.

I would also like Plot.auto(data, options) to be somehow guiding the user who has maybe forgotten to write .plot()?

(None of this is blocking.)

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few style nits on the code—because it’s easiest to comment on the color of the bike shed. One big-ish thing is whether we can avoid duplicating the arrayIsPrimitive logic/needing type inference to check whether the data is an array of primitives.

More importantly, it’d be helpful to recap whether there are significant outstanding tasks here? The list of the TODO’s in the code is different from the PR description, and in particular I wonder about these tasks:

  • How to specify a constant color, either color: {constant: "red"} or maybeColorChannel?
  • How to specify a z for multi-series line or area marks?
  • How to limit and sort bar charts?
  • Provide a default sizeReduce for ordinal/ordinal case?

The first, at least, feels like a blocker. And maybe the second and third…

All that said, I’m so excited for this even more high-level API!! 🥳

src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
return !isOrdinal(values);
}

// TODO What about sorted within series?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. If we passed (materialized) z here, we could group by z and then check that every group is monotonic. But I think it’s okay for this to be a nice-to-have that we do as a followup.

src/marks/auto.js Outdated Show resolved Hide resolved
src/marks/auto.js Outdated Show resolved Hide resolved
@Fil
Copy link
Contributor

Fil commented Jan 30, 2023

A challenge (also non-blocking) is to see how Plot.auto might impact the way people interact with the API. In particular, we want to make sure users will not rely on a specific outcome of plot.auto (for example if they use it to build a new system that has Plot.auto marks “deep inside”). We don't want our hands tied by the initial version. As Mike remarks, it might be enough to document visibly that it will not be considered a breaking change when the heuristic evolves over time to account for more (and refined) use cases.

@tophtucker
Copy link
Contributor Author

I think this will lead us to add a new color option to all marks (where dot would have stroke default to color, and bar or rect would have fill default to color). Similar story for size and r, maybe.

In my testing, the alias "color" has felt pretty good for making it easier to maintain sensible encoding across marks. But I'm not sure about the alias "size" for "r". From showing a couple people, it feels like the name "size" is deceptively generic; they expect it to somehow affect the size of any mark (without knowing exactly what that would mean), when it fact it always makes a dot.

I would also like Plot.auto(data, options) to be somehow guiding the user who has maybe forgotten to write .plot()?

I think the inspector for any mark could be more helpful, but that feels like a separate issue. (Where's your nice notebook where anything can expose a ".toInspect" method that draws a custom inspector? 😝)

we want to make sure users will not rely on a specific outcome of plot.auto

It'd help if we had the Plot.auto(…).source() method we've discussed to eject to a fixed mark.

@Fil
Copy link
Contributor

Fil commented Jan 30, 2023

it's here https://observablehq.com/@fil/object-toview

@mbostock
Copy link
Member

It'd help if we had the Plot.auto(…).source() method we've discussed to eject to a fixed mark.

I feel this is out-of-scope for Plot; Plot shouldn’t be responsible for code generation.

@mbostock
Copy link
Member

mbostock commented Jan 31, 2023

But I'm not sure about the alias "size" for "r".

Agree. If “size” is also an alias for “length” depending on the mark type (namely vector) then it could be useful to have this indirection.

src/marks/auto.js Outdated Show resolved Hide resolved
src/style.js Outdated Show resolved Hide resolved
Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs documentation, and will need to be merged with the axis mark changes now in main (meaning all the test snapshots will need to be regenerated), but this looks great! We can followup with some of the anticipated improvements in future! 🎉


export function auto(data, {x, y, fx, fy, color, size, mark} = {}) {
// Shorthand: array of primitives should result in a histogram
if (x === undefined && y === undefined) x = identity;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last blocker, IMO: I think this line for the "shorthand" is now too permissive. We probably want to check that the first element is a primitive, or else throw an error if no options are specified? Currently Plot.auto(aapl).plot() looks like this:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we drop this shorthand entirely for now. Maybe Plot.auto returns null or throws an error? I dunno.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah probably. If I just remove the line it looks like this:

image

Which I think looks broken. So I added an error:

image

@mbostock
Copy link
Member

mbostock commented Feb 3, 2023

Filed #1252 to track auto mark fast follows.

@mbostock mbostock merged commit cc1824b into main Feb 3, 2023
@mbostock mbostock deleted the toph/auto branch February 3, 2023 15:56
chaichontat pushed a commit to chaichontat/plot that referenced this pull request Jan 14, 2024
* Auto mark, working, with one test

* more tests

* pReTtIeR

* even pRetTIer

* Nullish checks on sizeValue and sizeReduce

Co-authored-by: Mike Bostock <mbostock@gmail.com>

* Facet frame style matches default grid style; more concise transform options

* isHighCardinality helper

* More permissive shorthand

* move isHighCardinality; adopt identity

* rewrap a few comments

* constant-color disambiguation

* remove completed TODO

* auto zero; auto z-order for decos

* sort exports

* Fixing marks import; regenerating tests

* Default size reduce

* Rm todo

* Remove shorthand for now :(

* rm unused import

* error on null x & y, too

* fix facet axis labels

* require opposite channel when reducing

---------

Co-authored-by: Mike Bostock <mbostock@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants