Categorical colormapping of 3D arrays #12356

Merged: mattpap merged 19 commits into branch-3.1 from ianthomas23/categorical_colormapping on Jan 29, 2023

Conversation

ianthomas23
Member

@ianthomas23 ianthomas23 commented Sep 7, 2022

This is a draft PR to add categorical colormapping to Bokeh. It is essentially a reimplementation of Datashader's _colorize function so that all of the data is at the Bokeh level and available for colormapping and inspection. It definitely targets a release after 3.0 and is here as a starting point for discussions about functionality and implementation.

Example:

from bokeh.models import ColorBar, EqHistColorMapper, WeightedStackColorMapper
from bokeh.palettes import varying_alpha_palette
from bokeh.plotting import figure, show
import numpy as np

colors = ["red", "green", "blue"]
min_alpha = 40

def categorical_data():
    n = 20
    samples = 1000
    rng = np.random.default_rng(92478)

    centers = [(0.3, 0.3), (0.7, 0.5), (0.3, 0.7)]
    radii = [0.2, 0.3, 0.15]
    ncats = len(radii)

    dx = 1.0/n
    data = np.zeros((n, n, ncats), dtype=np.float32)
    data[:] = np.nan

    for k in range(ncats):
        x = rng.normal(centers[k][0], radii[k], samples)
        y = rng.normal(centers[k][1], radii[k], samples)
        i = (x / dx).astype(int)
        j = (y / dx).astype(int)
        for ii, jj in zip(i, j):
            if 0 <= ii < n and 0 <= jj < n:
                if np.isnan(data[jj, ii, k]):
                    data[jj, ii, k] = 1.0
                else:
                    data[jj, ii, k] += 1.0
    return data

data = categorical_data()
p = figure(width=500, height=400)

alpha_mapper = EqHistColorMapper(palette=varying_alpha_palette(color="#000", n=10, start_alpha=min_alpha))
color_mapper = WeightedStackColorMapper(palette=colors, nan_color=(0, 0, 0, 0), alpha_mapper=alpha_mapper)#, color_baseline=-10)
p.image_stack(image=[data], x=0, y=0, dw=1, dh=1, color_mapper=color_mapper)

color_bar = ColorBar(color_mapper=color_mapper)
p.add_layout(color_bar, "right")

show(p)

which gives
[Image: screenshot of the resulting Bokeh plot]

This uses a new ImageStack class, which takes a 3D array of shape (ny, nx, ncat), where ncat is the number of categories. Each category has a color that is passed to the WeightedStackColorMapper, which aggregates the 3D data array down to a 2D RGBA array. The colors are combined using a (sort of) weighted sum across the categories for each array element, and the alpha value is determined by a separate alpha_mapper, which here is an EqHistColorMapper but could be linear, log or any other. You can control min_alpha via the alpha_mapper, and color_baseline via the WeightedStackColorMapper. The RGBA outputs are very similar to Datashader's but not identical; I need to identify whether the differences are just rounding errors or something more fundamental.
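
As a rough illustration of the aggregation described above, here is a minimal per-pixel sketch in plain Python. It is not the BokehJS implementation: the function names `mix_pixel` and `linear_alpha` are made up for this example, `color_baseline` is ignored, and a simple linear ramp stands in for the alpha_mapper (the example above uses EqHistColorMapper instead).

```python
import math

def mix_pixel(weights, colors):
    """Blend category colors for one pixel, weighting each by its data value.

    weights: one value per category; NaN means "no data" for that category.
    colors: one (r, g, b) tuple per category, components in 0-255.
    Returns ((r, g, b), total), or (None, 0.0) when the pixel has no data.
    """
    valid = [(w, c) for w, c in zip(weights, colors) if not math.isnan(w)]
    total = sum(w for w, _ in valid)
    if total <= 0:
        return None, 0.0  # a caller would substitute nan_color here
    rgb = tuple(round(sum(w * c[ch] for w, c in valid) / total) for ch in range(3))
    return rgb, total

def linear_alpha(total, low, high, min_alpha=40):
    """Map a pixel total linearly onto [min_alpha, 255] (assumed stand-in mapper)."""
    if high <= low:
        return 255
    frac = min(max((total - low) / (high - low), 0.0), 1.0)
    return round(min_alpha + frac * (255 - min_alpha))

colors = [(255, 0, 0), (0, 128, 0), (0, 0, 255)]  # red, green, blue
rgb, total = mix_pixel([3.0, float("nan"), 1.0], colors)
alpha = linear_alpha(total, low=0.0, high=4.0)
```

The color comes only from the relative weights of the categories, while the alpha comes only from the pixel total, which is why the two can be mapped independently.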

The colorbar shows the EqHist levels using a color from the varying_alpha_palette that is only used for this purpose.

Some of the design and implementation are a little crude and will need discussion and improvement.

This will only be the start of work in this direction. The longer term plan is to support multiple ways of combining the categorical data (within BokehJS or via JS callbacks) and significantly richer colorbar/legend/labelling to support more complicated situations.

There are no tests at all yet; I have been comparing against Datashader so far.

(Edited to follow updated code).

@codecov

codecov bot commented Sep 7, 2022

Codecov Report

Merging #12356 (d98de08) into branch-3.1 (f63969a) will increase coverage by 0.00%.
The diff coverage is 94.73%.

@@             Coverage Diff             @@
##           branch-3.1   #12356   +/-   ##
===========================================
  Coverage       92.23%   92.24%           
===========================================
  Files             314      314           
  Lines           19682    19702   +20     
===========================================
+ Hits            18154    18174   +20     
  Misses           1528     1528           

@mattpap mattpap added this to the 3.1 milestone Oct 29, 2022
@mattpap mattpap marked this pull request as ready for review October 29, 2022 22:18
@mattpap mattpap changed the base branch from branch-3.0 to branch-3.1 November 2, 2022 18:41
@ianthomas23 ianthomas23 force-pushed the ianthomas23/categorical_colormapping branch from e1d2280 to 8172717 Compare November 18, 2022 16:53
@ianthomas23 ianthomas23 force-pushed the ianthomas23/categorical_colormapping branch 2 times, most recently from 496c54a to 764945f Compare January 20, 2023 17:09
@mattpap
Contributor

mattpap commented Jan 20, 2023

Some of the CI failures are unrelated and will be fixed in PR #12631.

@ianthomas23
Member Author

> Some of the CI failures are unrelated and will be fixed in PR #12631.

Thanks. I will finish this off on Monday (hopefully) then it will be ready for discussion and review.

@ianthomas23
Member Author

ianthomas23 commented Jan 23, 2023

Comparison of the output here using Bokeh vs using Datashader's shade function:

[Image: screenshot comparing the Bokeh and Datashader output]

Code to reproduce:

from bokeh.layouts import row
from bokeh.models import ColorBar, EqHistColorMapper, WeightedStackColorMapper
from bokeh.palettes import varying_alpha_palette
from bokeh.plotting import figure, show
import datashader.transfer_functions as tf
import numpy as np
import xarray as xr

n = 20
colors = ["red", "green", "blue"]
min_alpha = 40
want_floats = False

def categorical_data():
    samples = 1000
    rng = np.random.default_rng(92478)

    centers = [(0.3, 0.3), (0.7, 0.5), (0.3, 0.7)]
    radii = [0.2, 0.3, 0.15]
    ncats = len(radii)

    dx = 1.0/n
    if want_floats:
        data = np.full((n, n, ncats), np.nan, dtype=np.float32)
    else:
        data = np.zeros((n, n, ncats), dtype=np.uint32)

    for k in range(ncats):
        x = rng.normal(centers[k][0], radii[k], samples)
        y = rng.normal(centers[k][1], radii[k], samples)
        i = (x / dx).astype(int)
        j = (y / dx).astype(int)
        for ii, jj in zip(i, j):
            if 0 <= ii < n and 0 <= jj < n:
                if np.isnan(data[jj, ii, k]):
                    data[jj, ii, k] = 1
                else:
                    data[jj, ii, k] += 1
    return data

data = categorical_data()

ps = []
for i in range(2):
    title = "Bokeh" if i == 0 else "Datashader"
    kwargs = dict(width=420, height=350, title=f"Colormapping in {title}")
    if i > 0:
        kwargs["x_range"] = ps[0].x_range
        kwargs["y_range"] = ps[0].y_range
        kwargs["width"] = 350
    p = figure(**kwargs)
    ps.append(p)

    if i == 0:  # Bokeh colormapping
        if not want_floats:  # Convert unwanted data to nan.
            data = data.astype(np.float64)
            data[data == 0] = np.nan
        alpha_mapper = EqHistColorMapper(palette=varying_alpha_palette(color="#000", n=10, start_alpha=min_alpha))
        color_mapper = WeightedStackColorMapper(palette=colors, nan_color=(0, 0, 0, 0), alpha_mapper=alpha_mapper)#, color_baseline=-10)
        p.image_stack(image=[data], x=0, y=0, dw=1, dh=1, color_mapper=color_mapper)

        color_bar = ColorBar(color_mapper=color_mapper)
        p.add_layout(color_bar, "right")
    else:  # Datashader colormapping
        coords = np.arange(n, dtype=np.float64)
        cat = ["A", "B", "C"]  # category labels, one per slice of the stack
        da = xr.DataArray(data=data, dims=["y", "x", "cat"],
            coords=dict(x=coords, y=coords, cat=cat))
        im = tf.shade(da, color_key=colors, min_alpha=min_alpha).to_numpy()
        p.image_rgba(image=[im], x=0, y=0, dw=1, dh=1)

show(row(ps))

@ianthomas23 ianthomas23 force-pushed the ianthomas23/categorical_colormapping branch from 764945f to e77f22d Compare January 23, 2023 12:54
@ianthomas23
Member Author

This is now ready. There are a couple of things to highlight and discuss.

  1. This reuses the Image glyph. You can pass in 2D arrays as before and colormap using LinearColorMapper or similar, or pass in 3D arrays and colormap using StackColorMapper. Validation of the 2D/3D arrays and their color mappers occurs in BokehJS but not Python.

  2. The new colormapper class is called StackColorMapper. I'm not really happy about the name but I cannot think of a better one. The Stack part of the name comes from the 3D array really being a stack of 2D arrays that are combined to be rendered as a 2D image. In datashader we would probably call it a CategoricalColorMapper but that is too specific for the Bokeh use case. It could be a FlatteningColorMapper.

@@ -40,14 +40,22 @@ export class ColorBarView extends BaseColorBarView {
this.connect(this.model.properties.display_high.change, () => this._metrics_changed())
}

get color_mapper(): ColorMapper {
// Color mapper that is used to render this colorbar.
Member Author

@ianthomas23 ianthomas23 Jan 23, 2023

Before this PR, all ColorMapper classes render themselves. Now a WeightedStackColorMapper renders its alpha_mapper. This function could be replaced by a virtual function in ColorMapper that is overridden in WeightedStackColorMapper, but the instanceof checks are quite common in the color mapper classes so I have followed that approach.

@@ -25,9 +25,9 @@ export class ImageView extends ImageBaseView {
}
}

- protected _flat_img_to_buf8(img: Arrayable<number>): Uint8ClampedArray {
+ protected _flat_img_to_buf8(img: Arrayable<number>, length_divisor: number): Uint8ClampedArray {
Member Author

@ianthomas23 ianthomas23 Jan 23, 2023

This length_divisor isn't ideal, but is the easiest way to add the new functionality here. Really it is the length of the third dimension of the supplied array, but as we are dropping the dimensionality of arrays before they get here the information is not available so has to be passed in from lower down the call stack.
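
To make the role of the divisor concrete, here is a small sketch in Python (not the BokehJS code). It assumes the divisor is used to recover the pixel count from the flattened array so that the RGBA buffer can be sized; the function name `rgba_buffer_length` is hypothetical.

```python
def rgba_buffer_length(flat_length, length_divisor):
    """Bytes needed for an RGBA image built from a flattened data array.

    flat_length: number of values in the flattened array (ny * nx * ncat).
    length_divisor: size of the dropped third dimension (1 for a plain 2D image).
    """
    if length_divisor < 1 or flat_length % length_divisor != 0:
        raise ValueError("flat_length must be a multiple of length_divisor")
    n_pixels = flat_length // length_divisor
    return n_pixels * 4  # one byte each for R, G, B and A

ny, nx, ncat = 20, 20, 3
stack_bytes = rgba_buffer_length(ny * nx * ncat, ncat)  # 3D stack
image_bytes = rgba_buffer_length(ny * nx, 1)            # ordinary 2D image
```

Both calls give the same buffer size, since the output image has ny * nx pixels regardless of how many categories were stacked.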

Contributor

We need to start using ndarrays in such APIs, but that's something for another time, as the existing ndarrays in bokehjs aren't that useful right now.

@@ -80,4 +81,9 @@ export abstract class ColorMapper extends Mapper<Color> {

protected abstract _v_compute<T>(xs: ArrayableOf<uint32 | Factor>,
values: Arrayable<T>, palette: Arrayable<T>, colors: {nan_color: T}): void

protected _v_compute_uint32(xs: ArrayableOf<uint32 | Factor>, values: Arrayable<uint32>,
Member Author

The original _v_compute is generic and is called from a number of different places. This new _v_compute_uint32 is a specialisation of that function for uint32, i.e. colors that have been converted to RGBA bytes. This is only ever called from the rgba_mapper() function.

Member Author

@ianthomas23 ianthomas23 Jan 23, 2023

Below the default behaviour is to drop back to the generic function. WeightedStackColorMapper implements its own version of the uint32 function as it directly manipulates the R, G, B and A bytes.
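
For readers unfamiliar with colors encoded as uint32, the following Python sketch shows what manipulating the R, G, B and A bytes of a packed value can look like. The byte order used here (R in the most significant byte) is an assumption for illustration only and may not match what BokehJS does; `pack_rgba` and `unpack_rgba` are hypothetical helpers.

```python
def pack_rgba(r, g, b, a):
    """Pack four 0-255 channel values into one 32-bit integer (R highest)."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0-255")
    return (r << 24) | (g << 16) | (b << 8) | a

def unpack_rgba(value):
    """Split a packed 32-bit color back into its (r, g, b, a) channels."""
    return (value >> 24) & 0xFF, (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

packed = pack_rgba(255, 0, 0, 64)      # semi-transparent red
channels = unpack_rgba(packed)
```

A mapper that mixes colors has to work on the separate channels like this, which is why a generic palette lookup is not enough for it.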

@bryevdv
Member

bryevdv commented Jan 23, 2023

> This reuses the Image glyph. You can pass in 2D arrays as before and colormap using LinearColorMapper or similar, or pass in 3D arrays and colormap using StackColorMapper. Validation of the 2D/3D arrays and their color mappers occurs in BokehJS but not Python.

This is the part I am concerned about. Having a situation where users have to carefully coordinate multiple parameter types because various combinations don't work at all together is inviting usage problems.

I guess I'd like to think about adding a new glyph type that only accepts stacked images and only accepts stacked color mappers. Users won't have to fret over matching the right kind of mapper with the right kind of image, and if they get something wrong it will be a property error at the Python level.

Alternatively, we could consider adding a validation rule but I do think those are best avoided unless there is no better option to provide meaningful errors at the python level.
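
For illustration only, the kind of cross-parameter check that such a validation rule would have to perform might look like the sketch below. It is plain Python, not Bokeh's validation machinery, and `check_image_mapper` is a hypothetical name.

```python
def check_image_mapper(ndim, mapper_is_stack):
    """Raise ValueError unless the array rank and the mapper kind agree.

    ndim: number of dimensions of the image array (2 or 3).
    mapper_is_stack: True for a stack-style mapper, False for a 2D mapper.
    """
    if ndim not in (2, 3):
        raise ValueError(f"expected a 2D or 3D array, got {ndim}D")
    if (ndim == 3) != mapper_is_stack:
        kind = "stack" if mapper_is_stack else "2D"
        raise ValueError(f"a {ndim}D array cannot be used with a {kind} color mapper")

check_image_mapper(2, mapper_is_stack=False)  # accepted
check_image_mapper(3, mapper_is_stack=True)   # accepted
```

With a dedicated glyph type, the property types would rule out the mismatched combinations and a check like this would not be needed.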

@ianthomas23
Member Author

I did originally write it as a separate ImageStack glyph but then I rolled it into Image after someone asked if that was possible. I am happy to separate it out again.

@bryevdv
Member

bryevdv commented Jan 23, 2023

My preference would be to have a separate ImageStack, but I am interested in hearing any other opinions from @bokeh/core.

Question: are there any other operations specific to image stacks, besides stacked colormapping, that we might imagine adding in the future? I can imagine some hover operations or other aggregations that might apply.

@ianthomas23
Member Author

> Question: are there any other operations specific to image stacks, besides stacked colormapping, that we might imagine adding in the future? I can imagine some hover operations or other aggregations that might apply.

Yes, in the short term I expect specific hover operations and probably some richer colorbar and legend possibilities. Longer term it is possible that this opens the door to a whole suite of image stack operations with a number of different color mappers.

@mattpap mattpap self-requested a review January 24, 2023 23:06
@ianthomas23
Member Author

I've moved the new functionality out of Image and put it in a new ImageStack. Image and ImageStack are pretty similar so there is the opportunity for a shared base class here, but we already have an ImageBase and I can't think of a reasonable name for a new one that inherits from ImageBase. This could be something we refactor in the future, and this also applies to the ColorMapper classes if we end up extending that class hierarchy in future which seems likely.

I've also modified the examples above in line with these changes.

@bryevdv
Member

bryevdv commented Jan 25, 2023

I really like the updated version, @ianthomas23. One last thought: what do you think about making StackColorMapper a base class, and renaming the current color mapper in this PR to WeightedStackColorMapper or similar? What comes to mind is that another mode of operation could be to have standard 2D image color mapping on just one "slice" of the stack, e.g. SlicedStackColorMapper. I could imagine a slider or other method used to scrub through the individual slices by setting the slice index on the colormapper. I think having StackColorMapper be a base class would leave the door open more cleanly for additional color mapping options for stacked images in the future.
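
A rough sketch of what the sliced mode could do, in plain Python with nested lists standing in for the array. Nothing here exists in Bokeh: `take_slice` and `palette_index` are hypothetical names, and a simple linear mapping is assumed.

```python
def take_slice(stack, index):
    """Extract the 2D slice at `index` from a (ny, nx, ncat) nested-list stack."""
    return [[cell[index] for cell in row] for row in stack]

def palette_index(value, low, high, n_colors):
    """Linearly map a value in [low, high] to a palette index in [0, n_colors - 1]."""
    if high <= low:
        return 0
    frac = min(max((value - low) / (high - low), 0.0), 1.0)
    return min(int(frac * n_colors), n_colors - 1)

stack = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # shape (2, 2, 2)
slice_1 = take_slice(stack, 1)                # second slice of the stack
indices = [[palette_index(v, 2, 10, 4) for v in row] for row in slice_1]
```

Changing the slice index (for example from a slider) would only change which 2D slice is fed to the ordinary 2D mapping step.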

@ianthomas23
Member Author

@bryevdv Yes, that is a good idea. I cannot think of a better name than WeightedStackColorMapper so let's go with that.

@ianthomas23 ianthomas23 force-pushed the ianthomas23/categorical_colormapping branch from 58ce956 to b138c6f Compare January 26, 2023 12:01
@mattpap
Contributor

mattpap commented Jan 27, 2023

bokeh/bokehjs/test/baselines/linux/Tabs__should_allow_tabs_header_location_below_with_active_==_1.png

This shouldn't have been committed. This file is empty in this PR. Restoring the original will fix failing integration tests.

@mattpap mattpap merged commit 81928db into branch-3.1 Jan 29, 2023
@mattpap mattpap deleted the ianthomas23/categorical_colormapping branch January 29, 2023 19:25
Chiemezuo pushed a commit to Chiemezuo/bokeh that referenced this pull request Aug 27, 2024
* Add new ImageStack class

* Add new StackColorMapper class

* Separate functions for ColorMapper mixing of encoded RGBA and Color

* Alpha mapping

* ColorBar for StackColorMapper

* Correct use of alpha_mapper to calculate alpha values

* Correct number of colors

* Support NaN in StackColorMapper

* Add color_baseline argument

* Special case for pixels with total data of zero

* Move ImageStack functionality into Image

* Update typescript

* Fix tests

* Add new unit and integration (visual) tests

* Update docs

* Separate ImageStack glyph

* Use WeightedStackColorMapper and abstract base class

* Rename some tests

* Remove unwanted baseline image change

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2024
3 participants