[BUG] Setting zoom range from Bokeh server has race condition resulting in inconsistent results #10387

Open
mattrobin opened this issue Aug 11, 2020 · 8 comments

Comments

@mattrobin

Software version info

Python==3.7.7
bokeh==2.1.1
MacOS==10.15.6
Safari==13.1.2 and Chrome==84.0.4147.125

Description of expected behavior and the observed behavior

In a Bokeh server app, if a figure's zoom range has already been set once by the server, setting the range again on the server side (during a callback handler) is unreliable: the displayed range may end up as either the first range or the newly set one.

Complete, minimal, self-contained example code that reproduces the issue

The minimal example below creates a plot and sets the zoom. On button click, the data is updated and the zoom is changed from the server side. The button click always updates the data. However, the zoom ends up in one of three possible states: the original zoom (0, 4), the new zoom (3, 6), or a partially updated zoom (0, 6). On my machine, the result was most often the original zoom (0, 4). Rapidly clicking the button often shows the plot flickering back and forth between these possibilities.

Run using bokeh serve example.py.

import numpy as np
import pandas as pd
from bokeh.models import Button
from bokeh.plotting import Figure, curdoc
from bokeh.layouts import column
from bokeh.models.sources import ColumnDataSource

def dashboard():
    data0 = pd.DataFrame({'y': [1, 2, 3] * 5, 'x': np.arange(15)})
    data1 = pd.DataFrame({'y': [4, 4, 5] * 5, 'x': np.arange(15) + 1})
    data_source = ColumnDataSource(data0)

    figure = Figure()
    figure.line('x', 'y', source=data_source)
    figure.y_range.start = 0
    figure.y_range.end = 4

    def update_data():
        # Swap in the second dataset, then move the y-range from the server side.
        new_data = ColumnDataSource.from_df(data1)
        data_source.data.update(**new_data)
        figure.y_range.start = 3
        figure.y_range.end = 6

    button = Button()
    button.on_click(update_data)

    curdoc().add_root(column(button, figure))

dashboard()

Screenshots or screencasts of the bug in action

[Image: zoom_issue]

Additional notes

Suggestions for a temporary workaround that ensures the zoom update from the server always takes effect would be appreciated.

@bryevdv
Member

bryevdv commented Aug 11, 2020

@shianiawhite If you are managing zoom levels yourself then you should not use the default auto-ranging DataRange1d. You should use a "dumb" Range1d:

figure.y_range = Range1d(0, 4)

Or this would also accomplish the same:

p = figure(..., y_range=(0,4))

@mattrobin
Author

In my real case, I would like to be able to switch between auto-ranging and a fixed range based on some server callback. I will have to look into whether this is possible by swapping out the range object during the server-side callback (I'm not sure if an object replacement will propagate to the client side).

That being said, a previous issue conversation suggests that setting the range when using a DataRange1d should still work as well. Or am I misunderstanding how the DataRange1d override is supposed to work?

@mattrobin
Author

Also, having just tried the "dumb" Range1d approach, I noticed a related issue. If a JS reset is emitted on a figure that uses a Range1d, the zoom now alternates between the originally set range and the updated range.

import numpy as np
import pandas as pd
from bokeh.models import Button, Range1d, CustomJS
from bokeh.plotting import Figure, curdoc
from bokeh.layouts import column
from bokeh.models.sources import ColumnDataSource

def dashboard():
    data0 = pd.DataFrame({'y': [1, 2, 3] * 5, 'x': np.arange(15)})
    data1 = pd.DataFrame({'y': [4, 4, 5] * 5, 'x': np.arange(15) + 1})
    data_source = ColumnDataSource(data0)
    figure = Figure()
    figure.line('x', 'y', source=data_source)
    figure.y_range = Range1d(0, 4)

    def update_data():
        # Swap in the second dataset, then move the y-range from the server side.
        new_data = ColumnDataSource.from_df(data1)
        data_source.data.update(**new_data)
        figure.y_range.start = 3
        figure.y_range.end = 6

    button = Button()
    button.on_click(update_data)

    js_reset = CustomJS(args=dict(figure=figure), code="figure.reset.emit()")
    data_source.js_on_change('data', js_reset)

    curdoc().add_root(column(button, figure))

dashboard()

[Image: zoom_issue2]

@bryevdv
Member

bryevdv commented Aug 12, 2020

@shianiawhite I am trying to get at what the actual useful, reasonable use case is here. In the example above you have one callback that sets the range and updates the data, and then another callback that immediately updates the range again because of the first data update. I can't imagine why this would ever be something that would make sense in a real application (what's the purpose of the first range update at all if another is to be triggered immediately?). Completely and entirely independent of any and all implementation details, what are you trying to have happen?

@mattrobin
Author

mattrobin commented Aug 12, 2020

@bryevdv As for calling the reset after setting the zoom range: I want everything about the figure to reset, except that the y-axis should use the new zoom range. For example, the x-axis would ideally still auto-range. It's not immediately obvious what the reset actually resets, and rather than trying to catch all cases and missing some, it makes more sense to me to simply update what the y-range is supposed to reset to, and then reset everything.
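One way the "update what the y-range resets to, then reset" idea might be sketched with a plain Range1d is through its `reset_start` and `reset_end` properties. This is only a sketch: the helper name `set_zoom` is hypothetical, and whether this avoids the alternating behaviour reported in this thread has not been verified here.

```python
from bokeh.models import Range1d

# A "dumb" range with fixed initial bounds, as suggested earlier in the thread.
y_range = Range1d(start=0, end=4)


def set_zoom(rng, start, end):
    """Hypothetical helper: set the visible bounds and the bounds a reset returns to."""
    rng.start, rng.end = start, end
    # Range1d exposes reset_start / reset_end for the values applied on reset.
    rng.reset_start, rng.reset_end = start, end


set_zoom(y_range, 3, 6)
```

In a server callback this would be called in place of assigning `start` and `end` directly, before any reset is triggered.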

For a more concrete explanation of the general goal, I'm working on an "app" which allows the user to quickly load and look through many data files. On seeing one of interest, they interact with the plot and then run some other fitting code on the data.

In more detail, I have thousands of data files containing time series data. Each data file consists of ~10,000 data points. A background process pre-loads the data for upcoming data files in the list. The figure shows the contents of one data file at a time. Often, the data has extreme outliers. So while getting the full view of the data with the auto-zoom (with its nicely designed margins/padding) is often the best solution, I also have a button which toggles calculating outliers and excluding them from the zoom (along the y-axis). This is then combined with a button that swaps out the data with the contents of the next file in the list. I originally had these plots load in separate browser pages or refresh the page, but refreshing the page for each file takes too long, and the figure appearing and disappearing is too distracting (with my current setup, I can quickly scan over 3 or 4 files per second). Once the user sees interesting data, they click a few key points on the plot, which are then used to run some fitting algorithm on the data. So the most important use case for the zoom range is swapping out what data file is being looked at with outliers excluded.
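For illustration, the "exclude outliers from the y zoom" step could look roughly like the sketch below. The thread does not say how outliers are actually detected, so the IQR fence rule, the function name `robust_y_range`, and the `k` and `padding` defaults are all assumptions.

```python
from statistics import quantiles


def robust_y_range(ys, k=1.5, padding=0.1):
    """Return (start, end) covering the non-outlier values of ys.

    Assumed rule: values outside [Q1 - k*IQR, Q3 + k*IQR] count as outliers.
    `padding` is a fraction of the kept span added on each side.
    """
    q1, _, q3 = quantiles(ys, n=4)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    kept = [y for y in ys if lo_fence <= y <= hi_fence]
    lo, hi = min(kept), max(kept)
    margin = (hi - lo) * padding
    return lo - margin, hi + margin


# Same shape of data as the example above, plus one extreme value.
ys = [1, 2, 3] * 5 + [250]
start, end = robust_y_range(ys)
```

The resulting pair is what would be assigned to the y-range in the server callback when the outlier toggle is on.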

@bryevdv
Member

bryevdv commented Aug 12, 2020

> So the most important use case for the zoom range is swapping out what data file is being looked at with outliers excluded.

Can you plot "regular" data with one glyph and outliers with a different glyph? Then you can configure the default auto-ranging DataRange1d to consider only the regular glyph, and exclude the outlier glyph, when performing auto-ranging.
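A minimal sketch of that suggestion, assuming the `renderers` property of DataRange1d is used to restrict auto-ranging; the data values and variable names are made up for illustration:

```python
from bokeh.models import DataRange1d
from bokeh.plotting import figure

p = figure()

# "Regular" data and outliers are drawn by two separate glyph renderers.
regular = p.line([0, 1, 2, 3], [1, 2, 3, 2])
outliers = p.scatter([1.5], [250])

# The default y_range is an auto-ranging DataRange1d. Listing only the
# regular renderer asks it to ignore the outlier glyph when auto-ranging.
p.y_range.renderers = [regular]
```

The outlier glyph is still drawn; it simply no longer contributes to the computed y bounds.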

@mattrobin
Author

mattrobin commented Aug 12, 2020

> > So the most important use case for the zoom range is swapping out what data file is being looked at with outliers excluded.

> Can you plot "regular" data with one glyph and outliers with a different glyph? Then you can configure the default auto-ranging DataRange1d to consider only the regular glyph, and exclude the outlier glyph, when performing auto-ranging.

That's a clever workaround. Then the toggle to exclude/include the outlier data could simply change if the outlier data is included with the auto-ranging regular glyph or the outlier glyph. I think that'd likely work for my case.
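The toggle described here could be sketched as a small routing function that decides which glyph's data each point belongs to. The function name `split_for_glyphs` and the precomputed `is_outlier` mask are hypothetical; how the mask is computed is left open.

```python
def split_for_glyphs(xs, ys, is_outlier, exclude_outliers):
    """Route points to the auto-ranged 'regular' glyph or the 'outlier' glyph.

    When exclude_outliers is False, every point goes to the regular glyph,
    so auto-ranging sees the full data.
    """
    regular = {"x": [], "y": []}
    outlier = {"x": [], "y": []}
    for x, y, flagged in zip(xs, ys, is_outlier):
        target = outlier if (exclude_outliers and flagged) else regular
        target["x"].append(x)
        target["y"].append(y)
    return regular, outlier


xs = [0, 1, 2, 3]
ys = [1, 250, 3, 2]
mask = [False, True, False, False]

regular_on, outlier_on = split_for_glyphs(xs, ys, mask, exclude_outliers=True)
regular_off, outlier_off = split_for_glyphs(xs, ys, mask, exclude_outliers=False)
```

Each returned dict has the column layout a data source expects, so the two results could be assigned to the data sources behind the two glyphs.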

From there, you can decide whether this is still an issue worth keeping open. My reading of the DataRange1d docs suggests I should be able to set and unset the start and end ranges, and this will accordingly override and un-override the auto-ranging. I originally expected this to apply to the reset as well. An exposed reset_start and reset_end is what I'm expecting to have available. Notably, I also think the workaround you gave will still feel a bit like a workaround in my code, and may be slightly confusing when I come back to it in 3 months. But I'm happy enough with it. Thank you!

@bryevdv
Member

bryevdv commented Aug 12, 2020

> suggests I should be able to set and unset the start and end ranges, and this will accordingly override and un-override the auto-ranging

That's probably what should happen in principle, yes. However, DataRange1d is actually one of the thorniest, most complicated things in Bokeh, having to mediate between all of:

  • auto-ranging some or all glyphs
  • ignoring (or not) invisible glyphs
  • user-specified overrides for start or end or both
  • reset history
  • auto-following behavior for streaming data
  • respecting hard min-interval/max-interval and min/max-bounds settings
  • adding computed range padding
  • range flipped?

at the same time. With this many combinations, it is already difficult to ensure complete testing in static configurations. Once you add in the possibility of changes over time, this is just more than we have been able to cover.
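As a rough back-of-the-envelope illustration of the combinatorics, the sketch below treats each of the eight concerns listed above as a simple on/off switch. That is a deliberate simplification (several of them take many values), so the real space is larger than these counts.

```python
from itertools import product

# The eight concerns from the list above, each reduced to on/off (an assumption).
concerns = [
    "auto-range only some glyphs",
    "ignore invisible glyphs",
    "user override of start/end",
    "reset history",
    "follow streaming data",
    "min/max interval and bounds",
    "range padding",
    "flipped range",
]

# Every static configuration is one assignment of on/off to each concern.
static_configs = list(product([False, True], repeat=len(concerns)))

# A change over time is an ordered pair of two different configurations.
transitions = len(static_configs) * (len(static_configs) - 1)

print(len(static_configs), transitions)
```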

That said the OP code definitely shows a bug, so I'll leave this open for at least a quick investigation of that sample in case there is some simple fix.


2 participants