When running a categorical bar plot example with years as the categories, I ran into several issues that took me a lot of time to solve. I am wondering whether this is a bug. If it is not, I would like to request that the method involved handle non-string values, and that error reporting be improved for data updates whose format differs from what is expected.
Issue 1
index_cmap = factor_cmap('year', palette=Spectral6, factors=groups) gives the following error if groups contains integers instead of strings:
expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)
Expectation: I would expect factors to handle any data type, e.g. integers as well.
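A minimal workaround sketch for Issue 1, assuming the years only need to appear as category labels: convert the values to strings before they are used as factors. The helper name `as_string_factors` is hypothetical and not part of Bokeh.

```python
def as_string_factors(values):
    """Return the sorted unique values as a list of strings.

    Hypothetical helper (not a Bokeh API). Bokeh's categorical factors
    must be strings, so integer years are converted here.
    """
    # Sort before converting so that the numeric order is preserved.
    return [str(v) for v in sorted(set(values))]


groups = as_string_factors([2001, 2000, 2001])
print(groups)  # ['2000', '2001']
```

The resulting list can be passed as factors=groups to factor_cmap and as x_range=groups to figure. The year column in the data source needs the same conversion, for example with .astype(str), so that the column values match the factors.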
Issue 2
In the example below, if I comment out .astype(str), as in df['year'] = df['year']#.astype(str), then the example does not show any plot.
Expectation: I would expect an error message in the console telling me what goes wrong.
Issue 3
If I leave the #df['year'] = df['year'].astype(str) line in the code below commented out, then upon pressing the button the plot axes are updated, but the data in the plot is not, even though I can verify that the data in the source is correctly set to the new data. The problem seems to be that the type of the column is not string.
Expectation: I would expect an error message in the console telling me what goes wrong.
Result without string conversion:
Expected result (shown when uncommenting):
In summary, my requests:
- handling of non-string factors
- better error messages when
  - data versus configuration leads to not showing a plot at all
  - updates triggered by data change do not lead to changed plots
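Until such a message exists, a check on the Python side can surface the mismatch before the new data is assigned. This is only a sketch: `check_categorical_column` is a hypothetical helper, not part of Bokeh, and it assumes the column and the factors are plain sequences of values.

```python
def check_categorical_column(column, factors):
    """Raise ValueError if the column cannot be matched against the factors.

    Hypothetical helper, not a Bokeh API.
    """
    # Categorical coordinates must be strings, like the factors themselves.
    non_strings = [v for v in column if not isinstance(v, str)]
    if non_strings:
        raise ValueError(
            f"categorical column contains non-string values: {non_strings!r}"
        )
    # Every value in the column should also be one of the range's factors.
    unknown = sorted(set(column) - set(factors))
    if unknown:
        raise ValueError(f"values missing from factors: {unknown!r}")


factors = ["2002", "2003", "2004"]
check_categorical_column(["2002", "2003", "2004"], factors)  # passes silently

try:
    check_categorical_column([2002, 2003, 2004], factors)
except ValueError as err:
    message = str(err)
    print(message)
```

In a callback such as change_data, the check could be called on df['year'] and groups just before source.data = df, so that a type mismatch raises in the Python process instead of failing silently in the browser.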
Complete, minimal, self-contained example code that reproduces the issue
from bokeh.plotting import curdoc, figure
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, Button
from bokeh.palettes import Spectral6
from bokeh.transform import factor_cmap
from bokeh.models.glyphs import VBar
df = pd.DataFrame(data=[[2000, 1], [2001, 2]],
                  columns=['year', 'count'])
# to fix error:
# expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)
df['year'] = df['year'].astype(str)
groups = df['year'].unique()
index_cmap = factor_cmap('year', palette=Spectral6, factors=groups)
p = figure(plot_width=600, plot_height=300, title="Test plot",
           x_range=groups)
vbar = VBar(x='year', top='count', width=1,
            line_color="white", fill_color=index_cmap)
source = ColumnDataSource(data=df)
bar_renderer = p.add_glyph(source, vbar)
def change_data():
    df = pd.DataFrame(data=[[2002, 5], [2003, 10], [2004, 10]],
                      columns=['year', 'count'])
    # to fix: uncomment this
    #df['year'] = df['year'].astype(str)
    groups = list(map(str, sorted(df['year'].unique())))
    index_cmap = factor_cmap('year', palette=Spectral6, factors=groups)
    vbar.fill_color = index_cmap
    source.data = df
    p.x_range.factors = groups
btn = Button(label="Change data")
btn.on_click(change_data)
curdoc().add_root(p)
curdoc().add_root(btn)
When not passing strings for the categories in the first place, I get:
expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)
Screenshots or screencasts of the bug in action
Hi @harmbuisman, thanks for your comments. I appreciate your frustrations, but unfortunately there are some obstacles. In particular, Bokeh is not really a Python library. Most of the actual work is done in JavaScript. This often makes error reporting:
difficult in the case of the Bokeh server where there is bidirectional communication between the browser and a Python process to receive errors, and
often impossible in the case of standalone output, where the content is shipped off to a browser and there is no Python process to report any errors back to at all (the only place errors can go is the JavaScript console)
That said, we do have both some validation checks at serialization time and a typed property system. The message you quote:
expected an element of either Seq(String), Seq(Tuple(String, String)) or Seq(Tuple(String, String, String)), got array([2000, 2001], dtype=int64)
Is actually the error message you are asking for, in my opinion. It's telling you exactly what went wrong, namely: the expected value should be a list (sequence) of strings, but you provided an array of ints, which is not allowed. This level of detail about types and values is actually more than almost any typical Python library provides, in my experience, so I am not sure what else we can do. Do you have any specific suggestions to improve the kind of property validation message we could consider?
As for non-string categories, this proposal has been discussed before, and rejected. If someone wants to work on it and submits a proposal and a PR to implement it, we would definitely consider it. But it's unlikely to be prioritized by the core team.
Hi Bryan, thanks for your extensive comments. Checking the JavaScript console is a good pointer. I will check that in future. For the issue above I get the message "slice is not a function", which is a bit hard to interpret. "Uncaught TypeError" seems to suggest something type related. If it is possible to catch that and report a more meaningful error message, that would help.
I understand your remarks and see that this is not a prio for the core team. Thanks
ALL software version info (bokeh, python, notebook, OS, browser, any other relevant packages)
bokeh.version = 1.3.0
python: 3.7.3
OS: Windows 10
browser: all
Description of expected behavior and the observed behavior
I ran into the issues when trying to adapt the Pandas example in https://bokeh.pydata.org/en/latest/docs/user_guide/categorical.html to my use case.