[BUG/FEATURE] Category barplot does not handle non-string categories + missing error message #9132

Open
harmbuisman opened this issue Aug 1, 2019 · 2 comments

Comments

@harmbuisman
Contributor

ALL software version info (bokeh, python, notebook, OS, browser, any other relevant packages)

bokeh.version = 1.3.0
python: 3.7.3
OS: Windows 10
browser: all

Description of expected behavior and the observed behavior

I ran into these issues when trying to adapt the Pandas example in https://bokeh.pydata.org/en/latest/docs/user_guide/categorical.html to my use case.

When running a category bar plot example on categories that are years, I ran into several issues that took me a lot of time to solve. I am wondering whether this is a bug. If it is not, I would like to request that the methods involved handle non-string values, and that error reporting be improved for the case where updated data is in a different format than expected.

Issue 1

index_cmap = factor_cmap('year', palette=Spectral6, factors=groups) gives the following error if groups contains integers instead of strings:
Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)

Expectation: I would expect factors to handle any data type, e.g. Integers as well
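Until non-string factors are supported, the conversion can be done before calling factor_cmap. The following is a minimal sketch, not part of Bokeh: to_string_factors is a hypothetical helper name, and it assumes flat (non-nested) categories whose str() form is the label you want on the axis.

```python
def to_string_factors(values):
    """Return the unique values as strings, in order of first appearance.

    Hypothetical helper (not a Bokeh API). Works on any iterable,
    e.g. a list, a NumPy array, or a pandas Series.
    """
    seen = {}  # dicts preserve insertion order, so this de-duplicates in order
    for value in values:
        seen.setdefault(str(value), None)
    return list(seen)


years = [2000, 2001, 2000]
factors = to_string_factors(years)
print(factors)  # ['2000', '2001']
```

The result could then be passed as factors=... and x_range=.... The same conversion would also have to be applied to the data column itself, as df['year'].astype(str) does in the example code, so that the column values match the factors.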

Issue 2

In the example below, if I comment out .astype(str) (i.e. df['year'] = df['year']#.astype(str)), the example does not show any plot.

Expectation: I would expect an error message in the console telling me what goes wrong.

Issue 3

If I leave the #df['year'] = df['year'].astype(str) line in the code below commented out, then upon pressing the button the plot axes are updated, but the data in the plot is not, even though I can verify that the data in the source is correctly set to the new data. The problem seems to be that the type of the column is not string.

Expectation: I would expect an error message in the console telling me what goes wrong.

Result without string conversion:
image

Expected result (shown when uncommenting):
image

In summary my requests:

  • handling of non-string factors
  • better error messages when
      ◦ data versus configuration leads to not showing a plot at all
      ◦ updates triggered by data change do not lead to changed plots
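As a rough illustration of the kind of error message requested here, the check below could be run on the Python side before assigning new data to a source. It is only a sketch under stated assumptions: check_categorical_column is a hypothetical helper, not a Bokeh function, and it only covers flat string factors (not the nested tuple factors that Bokeh also accepts).

```python
def check_categorical_column(data, column, factors):
    """Raise a descriptive error if data[column] does not fit the factors.

    Hypothetical helper (not a Bokeh API). `data` is any mapping from
    column names to iterables of values; `factors` is a flat sequence
    of strings, as used for a categorical range.
    """
    values = list(data[column])

    # Categorical coordinates must be strings; anything else is reported
    # with its type and value so the mismatch is easy to spot.
    non_strings = [v for v in values if not isinstance(v, str)]
    if non_strings:
        first = non_strings[0]
        raise TypeError(
            f"column {column!r} must contain strings for a categorical axis, "
            f"got {type(first).__name__} value {first!r}"
        )

    # Every value should also be one of the factors on the axis.
    known = set(factors)
    unknown = [v for v in values if v not in known]
    if unknown:
        raise ValueError(
            f"column {column!r} has values missing from factors: {unknown!r}"
        )
```

Called as check_categorical_column({'year': [2002, 2003]}, 'year', ['2002', '2003']), this raises a TypeError that names the column and the offending value, instead of the plot silently staying empty.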

Complete, minimal, self-contained example code that reproduces the issue

from bokeh.plotting import curdoc, figure
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, Button
from bokeh.palettes import Spectral6
from bokeh.transform import factor_cmap
from bokeh.models.glyphs import VBar

df = pd.DataFrame(data=[[2000, 1], [2001, 2]],
                  columns=['year', 'count'])

# to fix error:
# expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)
df['year'] = df['year'].astype(str)

groups = df['year'].unique()
index_cmap = factor_cmap('year', palette=Spectral6, factors=groups)

p = figure(plot_width=600, plot_height=300, title="Test plot",
           x_range=groups)

vbar = VBar(x='year', top='count', width=1,
            line_color="white", fill_color=index_cmap)

source = ColumnDataSource(data=df)
bar_renderer = p.add_glyph(source, vbar)

def change_data():
    df = pd.DataFrame(data=[[2002, 5], [2003, 10], [2004, 10]],
            columns=['year', 'count'])
    
    # to fix: uncomment this
    #df['year'] = df['year'].astype(str)
    groups = list(map(str, sorted(df['year'].unique())))

    index_cmap = factor_cmap('year', palette=Spectral6, factors=groups)
    vbar.fill_color = index_cmap
    source.data = df
    p.x_range.factors = groups

btn = Button(label="Change data")
btn.on_click(change_data)

curdoc().add_root(p)
curdoc().add_root(btn)

Stack traceback and/or browser JavaScript console output

When not passing string for the categories in the first place I get:
expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)

Screenshots or screencasts of the bug in action

@bryevdv
Member

bryevdv commented Aug 1, 2019

Hi @harmbuisman, thanks for your comments. I appreciate your frustrations, but unfortunately there are some obstacles. In particular, Bokeh is not really a Python library: most of the actual work is done in JavaScript. This often makes error reporting:

  • difficult in the case of the Bokeh server where there is bidirectional communication between the browser and a Python process to receive errors, and
  • often impossible in the case of standalone output, where the content is shipped off to a browser and there is no Python process to report any errors back to at all (the only place errors can go is the JavaScript console)

That said, we do have both some validation checks at serialization time and a typed property system. The message you quote:

expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)

Is actually the error message you are asking for, in my opinion. It's telling you exactly what went wrong, namely: the expected value should be a list (sequence) of strings, but you provided an array of ints, which is not allowed. This level of detail about types and values is actually more than almost any typical Python library provides, in my experience, so I am not sure what else we can do. Do you have any specific suggestions to improve the kind of property validation message we could consider?

As for non-string categories, this proposal has been discussed before, and rejected. If someone wants to work on it and submit a proposal and a PR to implement it, we would definitely consider it. But it's unlikely to be prioritized by the core team.

@harmbuisman
Contributor Author

Hi Bryan, thanks for your extensive comments. Checking the JavaScript console is a good pointer. I will check that in future. For the issue above I get the message "slice is not a function", which is a bit hard to interpret. "Uncaught TypeError" seems to suggest something type related. If it is possible to catch that with a more meaningful error message, that would help.

image

I understand your remarks and see that this is not a priority for the core team. Thanks.
