Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

density contours stroke & fill #948

Merged
merged 19 commits into from
Jun 24, 2022

Conversation

Fil
Copy link
Contributor

@Fil Fil commented Jun 22, 2022

As a complement to #943:

  • group by z, stroke, fill
  • apply other styles as well (constant like mixBlendMode, channel-based like title…)
  • use fill: "density" or stroke: "density" to color by "density level" (note that the numbers are not interpretable by the reader, so the legend is not recommended—or maybe we need to normalize are normalized by the maximum density across all facets and series)

todo:

@Fil Fil requested a review from mbostock June 22, 2022 15:06
@Fil Fil mentioned this pull request Jun 22, 2022
1 task
src/marks/density.js Outdated Show resolved Hide resolved
@Fil
Copy link
Contributor Author

Fil commented Jun 23, 2022

I've tested the initializer composition with the bin and the hexbin transforms… I don't think we want to add them to the unit tests because they don't have value as dataviz, but it's fun

Capture d’écran 2022-06-23 à 17 17 39
Capture d’écran 2022-06-23 à 17 17 50

src/marks/density.js Outdated Show resolved Hide resolved
@Fil Fil changed the base branch from mbostock/density-contours to main June 24, 2022 08:56
@Fil
Copy link
Contributor Author

Fil commented Jun 24, 2022

rebased on main

@Fil
Copy link
Contributor Author

Fil commented Jun 24, 2022

I want to have 1-d KDE, but it's not in the scope of this PR, which in my view is complete.

.bandwidth(bandwidth)
.thresholds(thresholds);

// First pass: seek the maximum density across all facets and series; memoize for performance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“Memoizing” refers to a caching technique where, given a function that computes a consistent return value for its given arguments, the return values are cached by their arguments so as to avoid invoking the function repeatedly for the same arguments. This isn’t really memoizing here; there’s not a function that represents the computational task. I haven’t quite followed what this code is doing yet but this is probably something like “cache the initial set of contours to avoid recomputing them in the second pass”. I need to read this code more thoroughly (I’m making an edit pass now) but I’ll add more comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, cache the contours and reuse those that ended up with the same threshold selection.

@mbostock
Copy link
Member

In the future, please don’t delete my PR before I’ve had a chance to review your contributions. I understand the desire to rebase to keep things up-to-date, but it’s now harder for me to review this code because I can’t easily see the changes you made on top of the previous PR I put up. Thank you. 🙏

@mbostock mbostock changed the base branch from main to mbostock/density-contours June 24, 2022 15:48
@mbostock
Copy link
Member

Okay, I was able to rebase my PR to main, then this PR to my PR, and we’re back to me being able to review all the changes here. 😅 Thanks for being patient with me.

.thresholds(thresholds)
(index))
.call(g => g.selectAll()
.data(Array.from(index, i => [i]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use applyChannelStyles instead of applyGroupedChannelStyles, we won’t need to do this. (I’ll make this change shortly.)

const W = channels.weight?.value;
const Z = channels.z?.value;
const {z} = this;
const [cx, cy] = applyFrameAnchor(options, dimensions);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applyFrameAnchor takes this as the first argument, not options. (I’ll fix this.)

@mbostock
Copy link
Member

We might also want to support fillOpacity = density and strokeOpacity = density? Could be left as a future improvement though; definitely not urgent. (I’ll consider implementing this now.)

Comment on lines 117 to 120
// normalize colors to a thresholds scale
const m = max * (maxn + 1) / maxn;
if (f) channels.fill.value = channels.fill.value.map(v => v / m);
if (s) channels.stroke.value = channels.stroke.value.map(v => v / m);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m thinking maybe we don’t do this, and just leave the units as “points per square pixel” or “value per square pixel” or perhaps “points per 1,000 square pixels” as a more human-readable value. Even though the value is arbitrary and resolution-dependent, it’s still probably more useful than 0 = 0, 1 = maximum observed density because it allows you to have a consistent scale across charts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends if you want to compare distributions or actual quantities, but I'm OK with reverting this. If needed, we could add the normalization as an option, but I'm not convinced it's that useful—I was just trying to make sense of the values.

100 square pixels is probably a better choice for a unit than 1000, as one can picture a 10x10 square hosting 1 or a few points.

@Fil
Copy link
Contributor Author

Fil commented Jun 24, 2022

I'm not sure about fillOpacity being useful, since the shapes are not exclusive bands the areas are overlaid and having a constant fillOpacity=0.1 is what results in a "transparent to opaque" display. (But not opposed to adding it for consistency.)

// Second pass: generate contours with the thresholds derived above
density.thresholds(thresholds);
for (const {facetIndex, index, c: memoc, top} of memo) {
const c = top < max ? density(index) : memoc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure how much this caching is helping; I think most commonly, only one series (or facet-series) will be cacheable as having the max threshold. All the others will need to be recomputed. I guess that’s still helpful in the common case of only a single series and single facet, but I suspect we could make this significantly faster if d3.contourDensity had a way to compute and store the blurred grid, and then compute individual contours for a particular threshold afterward. That way, we wouldn’t have to recompute the underlying grid when computing new contours, and we could cache individual contours rather than sets.

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some edits in abfd86e. To summarize:

  • For options that are only needed by the initializer, I moved the defaults and coercion into the initializer.
  • I fixed a bug where if the fill was specified as “density”, the default stroke is now “none” (not “currentColor”).
  • I removed this.path in favor of creating a path in render.
  • I added a comment explaining why mark.filter is overridden. I wonder if we also want to change how the channels are grouped: rather than picking the value associated with the first index for each facet-series, we should pick the first defined value? But this is probably fine as-is.
  • I adopted applyChannelStyles instead of applyGroupedChannelStyles to avoid constructing wrapper index arrays for each contour.
  • I added z to the set of channels that are not automatically grouped (since z is not needed to render the computed contours), and adopted a Set (since the explicit enumeration of channel names in the code was feeling a bit unwieldy with four named channels).
  • I adopted the coerceNumbers and identity technique used by the hexbin transform for the x, y, and weight channels.
  • I fixed the first argument to applyFrameAnchor.
  • I tweaked how the newChannels are constructed: it’s an object rather than an array, and I handle the fill and stroke = “density” cases separately.
  • I removed the use of constant(cx) and constant(cy), since d3-contour does that for us.
  • I removed then normalization of the color scale values (in the color = density case) and instead scaled by a fix factor of 100. Mostly because this is simpler and faster and I’m not really sure what we should do here.
  • I removed the caching logic in favor of skipping the threshold-finding pass when there’s only one facet-series. If we want to make this faster, we should instead consider making changes to the d3-contour API to avoid recomputing the blurred grid.
  • I forced the color scale to include zero in the color = density case, rather than the default which is that the color range will start at the smallest contour value.

@mbostock mbostock merged commit 563ce90 into mbostock/density-contours Jun 24, 2022
@mbostock mbostock deleted the fil/density-contours branch June 24, 2022 20:38
mbostock added a commit that referenced this pull request Jun 24, 2022
* density mark

* density contours stroke & fill (#948)

* density contours as an initializer: first pass

* density by z, carrying styles and channels (no reducer, only "first")

* fill: "density"

* density weight

* consistent thresholds across facets and series

note: we have to go around this bug in d3-contour: d3/d3-contour#57

* allow initializer composition; move initializer to the class; clean up

* density weight example

* error if x or y is undefined

(in the future we might route those to 1-dimensional transforms—KDE)

* document

* reduce img

* * don't apply the scales if we are already in pixel space (e.g. when composing with the hexbin transform)
* avoids a crash when there is no contour

* 1d density contours with frameAnchor

* adopt d3@7.4.5 for https://github.com/d3/d3-contour/releases/tag/v3.0.2

(not entirely sure if it's good practice to have all the yarn.lock changes in the PR, beyond those that are relevant)

* replace example image

* mike’s edits

* fix image dimensions

* isDensity

* distinct

* tweak tests

Co-authored-by: Mike Bostock <mbostock@gmail.com>

Co-authored-by: Philippe Rivière <fil@rezo.net>
@Fil
Copy link
Contributor Author

Fil commented Jun 24, 2022

Superb. Thank you for the detailed list of changes!

@Fil Fil mentioned this pull request Apr 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants