Automatic binning when a facet dimension is quantitative? #14

Open
mbostock opened this issue Nov 2, 2020 · 6 comments
Labels
enhancement New feature or request

Comments

@mbostock
Member

mbostock commented Nov 2, 2020

It’d be neat if you could use a quantitative dimension for faceting, and we automatically binned it (say using d3.bin) into a reasonable number of facets.
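To make the idea concrete, here is a minimal sketch in plain JavaScript that groups rows into equal-width bins of a quantitative field, with each bin's lower bound as the facet key. It is a dependency-free stand-in for what d3.bin would do (d3.bin would normally pick nice round thresholds), not Plot's implementation; `facetByBins` and its `count` parameter are hypothetical names.

```javascript
// Hypothetical helper: group rows into equal-width bins of a quantitative field.
// A dependency-free stand-in for d3.bin, not Plot's actual implementation.
function facetByBins(data, accessor, count = 4) {
  const values = data.map(accessor);
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / count || 1; // avoid zero-width bins when all values are equal
  const facets = new Map();
  for (const d of data) {
    // Clamp so that the maximum value lands in the last bin.
    const i = Math.min(count - 1, Math.floor((accessor(d) - min) / width));
    const key = min + i * width; // lower bound of the bin, used as the facet key
    if (!facets.has(key)) facets.set(key, []);
    facets.get(key).push(d);
  }
  return facets;
}

// Made-up sample rows, split into two facets: [0, 5) and [5, 10].
const rows = [{v: 0}, {v: 1}, {v: 4}, {v: 5}, {v: 9}, {v: 10}];
const facets = facetByBins(rows, (d) => d.v, 2);
```

Each entry of the resulting Map would correspond to one facet; choosing `count` automatically is the open question in this issue.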

mbostock added the enhancement label Nov 3, 2020
@Fil
Contributor

Fil commented Nov 24, 2020

Borrowing from cartography, we would want to use Jenks natural breaks or k-means, not only quantize. This seems particularly relevant for faceting, to avoid creating spurious (almost empty) facets. E.g., if the dimension has 3 modes, we want those modes as the facets.

This would be done, I guess, by specifying the thresholds (or threshold generator) to d3.bin.

For a relevant example, I combined ac93f58 with simple-statistics' ckmeans method to cluster countries by GDP per capita:
[Image: screenshot, 2020-11-24 at 10:01:51]

These 4 clusters would be my facets.
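A rough sketch of how cluster-based thresholds could be derived, in plain JavaScript. Note that this uses a basic 1-D k-means (Lloyd's algorithm) as a simpler stand-in, not ckmeans or Jenks natural breaks; `kmeansThresholds` is a hypothetical name and the sample values are made up.

```javascript
// Sketch: derive facet thresholds from a 1-D k-means (Lloyd's algorithm).
// This is NOT ckmeans or Jenks; it is a simpler stand-in for illustration.
function kmeansThresholds(values, k, iterations = 50) {
  const sorted = [...values].sort((a, b) => a - b);
  // Initialize centroids at evenly spaced positions in the sorted data.
  let centroids = Array.from({length: k}, (_, i) =>
    sorted[Math.floor(((i + 0.5) * sorted.length) / k)]
  );
  let assignment = [];
  for (let iter = 0; iter < iterations; ++iter) {
    // Assign each value to its nearest centroid.
    assignment = sorted.map((v) => {
      let best = 0;
      for (let j = 1; j < k; ++j) {
        if (Math.abs(v - centroids[j]) < Math.abs(v - centroids[best])) best = j;
      }
      return best;
    });
    // Move each centroid to the mean of its members (keep it if the cluster is empty).
    centroids = centroids.map((c, j) => {
      const members = sorted.filter((_, i) => assignment[i] === j);
      return members.length ? members.reduce((s, v) => s + v, 0) / members.length : c;
    });
  }
  // A threshold is the midpoint between the last value of one cluster and the first of the next.
  const thresholds = [];
  for (let i = 1; i < sorted.length; ++i) {
    if (assignment[i] !== assignment[i - 1]) thresholds.push((sorted[i - 1] + sorted[i]) / 2);
  }
  return thresholds;
}

// Three obvious modes in made-up data yield two thresholds.
const clusterBreaks = kmeansThresholds([1, 2, 3, 10, 11, 12, 100, 101, 102], 3);
```

The returned array could then be passed as the thresholds to d3.bin, as suggested above.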

@mbostock
Member Author

The default thresholds from d3.ticks have the nice property that the axis documents the threshold values. If you specify alternative thresholds, I wonder whether there would be a convenient way to use those values as ticks too; it’s hard to tell from the screenshot above exactly where the thresholds are. Though I suppose exactness is not essential, and they’re probably not nice round values anyway.

@Fil
Contributor

Fil commented Nov 24, 2020

The ckMeansNiceThresholds function (https://observablehq.com/d/e87ba37a7b86bb94#ckMeansNiceThresholds) returns "not so ugly" thresholds. I suppose we could use them as ticks, for example [14500, 38000, 80000].

@mbostock
Member Author

Adding ticks: breaks to the x-axis definition works well if you’re passing in explicit thresholds.
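A minimal sketch of that configuration, reusing the threshold values quoted earlier in the thread. Only the options object is built here, so the snippet has no dependencies; `tickBreaks` and `plotOptions` are hypothetical names, and the marks that would go with it are left out.

```javascript
// Thresholds quoted earlier in the thread; used both for binning and as axis ticks.
const tickBreaks = [14500, 38000, 80000];

// Options object intended for Plot.plot; marks are omitted to keep the sketch dependency-free.
const plotOptions = {
  x: {
    ticks: tickBreaks // place a tick at each threshold rather than at default tick values
  }
};
```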

@mbostock
Member Author

mbostock commented Feb 24, 2021

@mbostock
Member Author

The interval scale option is a great workaround for this issue. It’s not automatic since the interval isn’t computed automatically, but it makes it very easy to bin while faceting. For example:

[Image: screenshot, 2023-04-23 at 1:50:27 PM]

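Plot.plot({
  fy: {
    grid: true,
    tickFormat: ".1f",
    interval: 0.1, // bin the facet value (height) into 0.1-wide intervals
    reverse: true
  },
  marks: [
    // Drop rows with a missing height before faceting on it.
    Plot.boxX(olympians.filter((d) => d.height), {x: "weight", fy: "height"})
  ]
})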
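For intuition, a numeric interval can be thought of as flooring each value to a multiple of the interval before grouping. The sketch below illustrates that in plain JavaScript; it is an approximation of the behavior, not Plot's source, and `floorToInterval` plus the sample rows are hypothetical.

```javascript
// Sketch of what a numeric interval does to a value: floor it to a multiple of the interval.
// Multiplying by the reciprocal (10 for 0.1) sidesteps some floating-point error.
function floorToInterval(value, interval) {
  const n = Math.round(1 / interval); // assumes interval is 1/n for an integer n, e.g. 0.1
  return Math.floor(value * n) / n;
}

// Hypothetical sample rows standing in for the olympians dataset.
const athletes = [
  {height: 1.87, weight: 80},
  {height: 1.82, weight: 75},
  {height: 1.72, weight: 64},
  {height: null, weight: 70} // missing height, filtered out as in the example above
];

// Group weights by binned height; each key would be one fy facet.
const byHeightBin = new Map();
for (const d of athletes.filter((d) => d.height)) {
  const key = floorToInterval(d.height, 0.1);
  if (!byHeightBin.has(key)) byHeightBin.set(key, []);
  byHeightBin.get(key).push(d.weight);
}
```

Here the 1.87 and 1.82 rows share the 1.8 facet, while the 1.72 row falls in the 1.7 facet.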

Fil added a commit that referenced this issue May 1, 2023