Add support for legend_field with geo data #9398

raholler · 2019-11-11T14:01:33Z

ALL software version info (bokeh, python, notebook, OS, browser, any other relevant packages)

bokeh version 1.4.0
python 3.7
Windows 10

Description of expected behavior and the observed behavior

I want to add a legend to my plot of geodata. In particular, I plot point data colored according to a categorical variable in my data set. I convert my GeoPandas data frame to a GeoJSONDataSource accordingly. Everything works well, except creating the legend.

When I follow this example: https://docs.bokeh.org/en/latest/docs/user_guide/annotations.html#legends
I get the following error:

Column to be grouped does not exist in glyph data source.

Even though I include the source in the glyph method, i.e.

p1.circle(
    "x",
    "y",
    source=geosource,
    fill_color={"field": "share_prot", "transform": color_mapper},
    line_color="black",
    line_alpha=0.5,
    line_width=0.3,
    alpha=0.6,
    #size=2,
    legend_group="share_prot"
)

Complete, minimal, self-contained example code that reproduces the issue

from bokeh.io import output_file, show
from bokeh.models import GeoJSONDataSource
from bokeh.plotting import figure
from bokeh.sampledata.sample_geojson import geojson
import json

data = json.loads(geojson)
for i in range(len(data['features'])):
    data['features'][i]['properties']['Color'] = ['blue', 'red'][i%2]

geo_source = GeoJSONDataSource(geojson=json.dumps(data))
p = figure(background_fill_color="lightgrey")
p.circle(x='x', y='y', size=15, color='Color', alpha=0.7, source=geo_source, legend_group='Color')

show(p)

Stack traceback and/or browser JavaScript console output

<ipython-input-133-afa857e8c002> in <module>
     12 color_mapper = CategoricalColorMapper(factors=share_prot.unique(), palette=palette)
     13 p = figure(background_fill_color="lightgrey")
---> 14 p.circle(x='x', y='y', size=15, color='Color', alpha=0.7, source=geo_source, legend_group='Color')
     15 
     16 show(p)

fakesource in circle(self, x, y, **kwargs)

~\Anaconda3\envs\adv\lib\site-packages\bokeh\plotting\helpers.py in func(self, **kwargs)
    930 
    931         if legend_kwarg:
--> 932             _update_legend(self, legend_kwarg, glyph_renderer)
    933 
    934         self.renderers.append(glyph_renderer)

~\Anaconda3\envs\adv\lib\site-packages\bokeh\plotting\helpers.py in _update_legend(plot, legend_kwarg, glyph_renderer)
    487     kwarg, value = list(legend_kwarg.items())[0]
    488 
--> 489     _LEGEND_KWARG_HANDLERS[kwarg](value, legend, glyph_renderer)
    490 
    491 

~\Anaconda3\envs\adv\lib\site-packages\bokeh\plotting\helpers.py in _handle_legend_group(label, legend, glyph_renderer)
    454         raise ValueError("Cannot use 'legend_group' on a glyph without a data source already configured")
    455     if not (hasattr(source, 'column_names') and label in source.column_names):
--> 456         raise ValueError("Column to be grouped does not exist in glyph data source")
    457 
    458     column = source.data[label]

ValueError: Column to be grouped does not exist in glyph data source


bryevdv · 2019-11-12T02:40:38Z

@raholler It would be a fair amount of effort to make legend_group work here, because it does its grouping on the Python side, and GeoJSON data source columns are not fully realized until things hit the browser.

However, a very minor change allows legend_field, which does grouping in the browser, to function:

I am going to mark this issue as a feature to add support for legend_field, and also to raise a more informative error when legend_group is attempted to be used with a geo source, stating explicitly that that combination is not supported (but that legend_field is).

Would you like to work on this PR, with some guidance?
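Until that lands, one possible workaround is to do the grouping by hand on the Python side before anything reaches Bokeh. The following is only a minimal sketch using the standard library: the helper name split_geojson_by_property and the sample collection are made up for illustration and are not part of Bokeh.

```python
import json
from collections import defaultdict


def split_geojson_by_property(geojson_str, prop):
    """Group the features of a GeoJSON FeatureCollection by one property.

    Hypothetical helper (not a Bokeh API). Returns a dict mapping each
    distinct property value to a GeoJSON string that contains only the
    features with that value.
    """
    collection = json.loads(geojson_str)
    groups = defaultdict(list)
    for feature in collection["features"]:
        # Features without the property are grouped under None.
        value = (feature.get("properties") or {}).get(prop)
        groups[value].append(feature)
    return {
        value: json.dumps({"type": "FeatureCollection", "features": features})
        for value, features in groups.items()
    }


# Tiny made-up collection, for illustration only.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [i, i]},
            "properties": {"Color": ["blue", "red"][i % 2]},
        }
        for i in range(4)
    ],
})

parts = split_geojson_by_property(sample, "Color")
for value, part in parts.items():
    print(value, len(json.loads(part)["features"]))
```

Each resulting string could then be given its own GeoJSONDataSource(geojson=part) and drawn with a separate glyph call that passes legend_label=str(value), so every category gets one legend entry. Note that this sketch drops any other top-level members of the original collection.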

bryevdv · 2019-11-12T02:43:28Z

For reference, the change I made to generate the above plot with legend_field was:

diff --git a/bokeh/models/annotations.py b/bokeh/models/annotations.py
index 421fd0bd1..e48bc4e8e 100644
--- a/bokeh/models/annotations.py
+++ b/bokeh/models/annotations.py
@@ -154,14 +154,14 @@ class LegendItem(Model):
             if len({r.data_source for r in self.renderers}) != 1:
                 return str(self)

-    @error(BAD_COLUMN_NAME)
-    def _check_field_label_on_data_source(self):
-        if self.label and 'field' in self.label:
-            if len(self.renderers) < 1:
-                return str(self)
-            source = self.renderers[0].data_source
-            if self.label.get('field') not in source.column_names:
-                return str(self)
+    # @error(BAD_COLUMN_NAME)
+    # def _check_field_label_on_data_source(self):
+    #     if self.label and 'field' in self.label:
+    #         if len(self.renderers) < 1:
+    #             return str(self)
+    #         source = self.renderers[0].data_source
+    #         if self.label.get('field') not in source.column_names:
+    #             return str(self)

A real solution would not comment out the validation check, but either:

make it skip when the source is a GeoJSONDataSource, or
have it more carefully inspect the GeoJSON properties directly
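To make the second option concrete, here is a rough sketch of what inspecting the GeoJSON properties could look like. It uses only the standard library, and the function name geojson_property_names is an assumption for illustration, not something that exists in Bokeh. It also only covers feature properties, not the geometry-derived columns such as x and y.

```python
import json


def geojson_property_names(geojson_str):
    """Return the set of property names used by any feature.

    Hypothetical helper (not a Bokeh API): a validation check could
    compare a legend field name against this set instead of against
    source.column_names.
    """
    collection = json.loads(geojson_str)
    names = set()
    for feature in collection.get("features", []):
        # "properties" may be missing or null in valid GeoJSON.
        names.update((feature.get("properties") or {}).keys())
    return names


# Made-up sample data, for illustration only.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [0, 0]},
         "properties": {"Color": "blue"}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [1, 1]},
         "properties": {"Color": "red", "name": "a"}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [2, 2]},
         "properties": None},
    ],
})

names = geojson_property_names(sample)
print("Color" in names, "share_prot" in names)
```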

raholler · 2019-11-13T12:10:29Z

I would generally be willing to work on it with some guidance, but I am very busy until December 13th. If after that date is fine for you, I can try (I am not that experienced in working on packages).

But I think in general, one could rethink the treatment of geodata. Most people who work with geodata in Python use GeoPandas. Maybe a more direct link to GeoPandas would be more useful and easier than going through GeoJSON. I saw a related issue, but cannot find it anymore.

bryevdv · 2019-11-13T15:39:54Z

@raholler There's no hurry, so I'm happy to work with you on this whenever you are able to look at it.

As for GeoPandas, I think that would be fantastic, but it is also an orthogonal concern. It would be appropriate to open a new issue to start a discussion about that.

meenurajapandian · 2020-03-03T05:17:34Z

I came across this while looking for a way to add a legend to a heatmap with a linear color mapper that uses a GeoJSONDataSource. But legend_field does not work for me either.

The figure is basically this:
p.patches('xs', 'ys', fill_color={'field': 'some_field', 'transform': mapper}, source=geo_src, legend_field='some_field')

some_field is a feature of each patch and geo_src is a GeoJSONDataSource. The color is mapped as required, but I am not able to add a legend. Is there a workaround to get the legend?

hmanuel1 · 2020-04-17T21:39:44Z

I ran into the same issue with a custom legend for GeoPandas. This is working somewhat for me. It works as long as you don't expect x_range or y_range to change. I am trying to figure out how to prevent the "fake quads" from rendering, so that they do not affect x_range or y_range when replacing the content of a figure. It would be nice if manual legends could be implemented without having to specify an actual coordinate pair in the plot.

from bokeh.io import show
from bokeh.models import LogColorMapper, Legend
from bokeh.palettes import Viridis6 as palette
from bokeh.plotting import figure
from bokeh.sampledata.unemployment import data as unemployment
from bokeh.sampledata.us_counties import data as counties

palette = tuple(reversed(palette))

counties = {
    code: county for code, county in counties.items() if county["state"] == "tx"
}

county_xs = [county["lons"] for county in counties.values()]
county_ys = [county["lats"] for county in counties.values()]

county_names = [county['name'] for county in counties.values()]
county_rates = [unemployment[county_id] for county_id in counties]
color_mapper = LogColorMapper(palette=palette)

data=dict(
    x=county_xs,
    y=county_ys,
    name=county_names,
    rate=county_rates,
)

TOOLS = "pan,wheel_zoom,reset,hover,save"

p = figure(
    title="Texas Unemployment, 2009", tools=TOOLS,
    x_axis_location=None, y_axis_location=None,
    tooltips=[
        ("Name", "@name"), ("Unemployment rate", "@rate%"), ("(Long, Lat)", "($x, $y)")
    ])
p.grid.grid_line_color = None
p.hover.point_policy = "follow_mouse"

p.patches('x', 'y', source=data,
          fill_color={'field': 'rate', 'transform': color_mapper},
          fill_alpha=0.7, line_color="white", line_width=0.5)

# Custom geo legend that works with this example. Each legend entry is backed
# by a zero-area "fake" quad anchored at one real coordinate of the data.
xq, yq = data['x'][0][0], data['y'][0][0]

# For a GeoPandas dataframe (gdf) you can do the same by selecting a coordinate
# of a valid polygon. The following line takes the coordinate pair xq and yq
# from the first polygon of the dataframe:
# (xq, yq) = list(gdf['geometry'].values[0].envelope.centroid.coords)[0]

legend_names = [f"Legend Item {i}" for i in range(len(palette))]

# One zero-area quad per palette color, listed from last color to first.
items = [
    (legend_names[i], [p.quad(top=yq, bottom=yq, left=xq,
                              right=xq, fill_color=palette[i])])
    for i in reversed(range(len(palette)))
]

p.add_layout(Legend(items=items, location='bottom_left',
             title="Unemployment Rate:"))

show(p)
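The placeholder names "Legend Item 0" and so on could be replaced by the value range each color stands for. The sketch below is only an illustration under two assumptions that the example above does not state: that the log color mapper splits the interval from low to high into bins of equal width in log space, and that low and high are the smallest and largest rate. The helper log_bin_labels is hypothetical.

```python
import math


def log_bin_labels(low, high, n_bins, fmt="{:.1f}"):
    """Label n_bins intervals that are equally wide in log space.

    Hypothetical helper: assumes the color mapper divides [low, high]
    into n_bins log-spaced bins, one per palette color.
    """
    if low <= 0 or high <= low:
        raise ValueError("need 0 < low < high")
    log_low, log_high = math.log(low), math.log(high)
    step = (log_high - log_low) / n_bins
    # n_bins + 1 edges delimit n_bins intervals.
    edges = [math.exp(log_low + i * step) for i in range(n_bins + 1)]
    return [
        f"{fmt.format(lo)} to {fmt.format(hi)}"
        for lo, hi in zip(edges, edges[1:])
    ]


# Made-up bounds, for illustration only.
labels = log_bin_labels(1.0, 1000.0, 3, fmt="{:.0f}")
print(labels)
```

In the example above, something like legend_names = log_bin_labels(min(county_rates), max(county_rates), len(palette)) would then give one label per palette color, provided all rates are positive.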

raholler added the TRIAGE label Nov 11, 2019

bryevdv added type: feature and removed TRIAGE labels Nov 12, 2019

bryevdv changed the title ~~[BUG] Automatic Grouping (Python) does not work for geodata~~ Add support for legend_field with geo data Nov 12, 2019

bryevdv added the good first issue label Nov 12, 2019

bryevdv added this to the short-term milestone Nov 12, 2019

bryevdv removed the good first issue label Jun 24, 2021

prusswan mentioned this issue Feb 28, 2023

Legend not working with GeoJSONDataSource #5904

Open
