[BUG] DataTable column sort not working with NaNs #10251
Comments
Table sorting is done here: bokeh/bokehjs/src/lib/models/widgets/tables/data_table.ts Lines 83 to 104 in a68a6f8
It's definitely not NaN-aware. This would be a nice first issue for a new contributor to tackle.
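For illustration, here is a minimal Python sketch of what a NaN-aware sort could look like. It is not the bokehjs implementation and not the fix that was eventually merged; the helper names `is_nan` and `nan_aware_argsort` are hypothetical. The idea is to separate NaN rows from the rest, sort only the comparable values, and append the NaN rows at the end in both sort directions, since every ordering comparison against NaN evaluates to false and therefore confuses an ordinary comparator.

```python
import math


def is_nan(value):
    """Return True only for float NaN (hypothetical helper)."""
    return isinstance(value, float) and math.isnan(value)


def nan_aware_argsort(values, descending=False):
    """Return row indices that sort `values`, with NaN rows always last.

    Hypothetical helper for illustration; it assumes NaNs should sink to
    the bottom regardless of the sort direction.
    """
    indices = range(len(values))
    regular = [i for i in indices if not is_nan(values[i])]
    missing = [i for i in indices if is_nan(values[i])]
    # Only comparable values take part in the actual sort.
    regular.sort(key=lambda i: values[i], reverse=descending)
    return regular + missing


nan = float("nan")
dates = [-0.816, -0.455, nan, nan, nan, -0.057, -0.079, -0.090, -1.999, -1.329]
ascending_order = nan_aware_argsort(dates)
descending_order = nan_aware_argsort(dates, descending=True)
```

Returning row indices rather than sorted values mirrors what a table needs: the same permutation can then be applied to every column.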
I am also experiencing this problem. Probably does not add much information but I came up with this minimal example:

    import numpy as np
    from bokeh.models.widgets import DataTable, TableColumn, Div
    from bokeh.layouts import column, gridplot, row
    from bokeh.plotting import figure, output_file, show
    from bokeh.models import ColumnDataSource, HoverTool

    def main():
        data = dict(
            dates=[-0.816, -0.455, np.nan, np.nan, np.nan, -0.057, -0.079, -0.090, -1.999, -1.329],
            downloads=[i for i in range(10)],
        )
        source = ColumnDataSource(data)
        columns = [
            TableColumn(field="dates", title="Date"),
            TableColumn(field="downloads", title="Downloads"),
        ]
        data_table = DataTable(source=source, columns=columns, sizing_mode='stretch_height', sortable=True)
        c = row(children=[data_table], sizing_mode='stretch_height', min_height=10)
        output_file(r"C:\temp\bla.html")
        show(c)

    if __name__ == '__main__':
        main()

Clicking the column
A preliminary fix for the sorting algorithm is available in PR #10318. However, one important thing to note is how NaNs are currently handled in bokeh. Data like:

    data = dict(dates=[-0.816, -0.455, np.nan, np.nan, np.nan, -0.057, -0.079, -0.090, -1.999, -1.329])

is encoded as:

    [-0.816, -0.455, "NaN", "NaN", "NaN", -0.057, -0.079, -0.090, -1.999, -1.329]

which results in textual (lexicographic) comparisons instead of numerical ones. To get the desired behavior, one needs to use numpy arrays:

    data = dict(dates=np.array([-0.816, -0.455, np.nan, np.nan, np.nan, -0.057, -0.079, -0.090, -1.999, -1.329]))

This serializes as a binary array and deserializes as a typed array in bokehjs, with NaNs and infinities preserved. This dichotomy is unfortunate and I will try to get it resolved ASAP (hopefully in 2.2).
What is your plan here? This dichotomy exists because the JSON standard neglected to incorporate nan and inf directly. Do you propose to automagically wrap all plain lists in numpy arrays so that either the binary protocol or base64 encoding will be used?
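To make the JSON limitation concrete, here is a small sketch using only the Python standard library. It does not reproduce Bokeh's actual wire format; `encode_floats` and `decode_floats` are hypothetical helpers that stand in for the general idea of shipping floats as base64-encoded binary. By default Python's `json` module emits the non-standard tokens `NaN` and `Infinity`; with `allow_nan=False` it refuses to serialize them, while a base64-encoded binary buffer survives a strict JSON round trip with NaN and infinity intact.

```python
import base64
import json
import math
import struct

values = [-0.816, float("nan"), float("inf")]

# Default behaviour: non-standard tokens that strict JSON parsers reject.
lenient = json.dumps(values)

# Strict behaviour: the standard has no representation for NaN or infinity.
try:
    json.dumps(values, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False


def encode_floats(numbers):
    """Pack floats as little-endian float64 and base64-encode them (hypothetical helper)."""
    raw = struct.pack(f"<{len(numbers)}d", *numbers)
    return base64.b64encode(raw).decode("ascii")


def decode_floats(text):
    """Inverse of encode_floats (hypothetical helper)."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# The base64 string is ordinary JSON text, so strict serialization succeeds.
payload = json.dumps({"dates": encode_floats(values)}, allow_nan=False)
decoded = decode_floats(json.loads(payload)["dates"])
```

The trade-off is that the payload is no longer human-readable, which is part of why plain lists and arrays ended up on different code paths.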
I will start another issue to explain how I would like to proceed, especially since arrays with NaNs are just a special case of a broader problem with serialization.
Versions: Python 3.7.3; Bokeh 2.1.1; Numpy 1.18.5; Mac OS X 10.15.5; Chrome: Version 83.0.4103.116 (Official Build) (64-bit)
Issue: DataTable sorting does not work as expected with NaNs. After sorting, NaNs are scattered through the column rather than being moved to the top or bottom.
Example
Console output