Add DataModel base class #3674
Hi @guillochon, thanks for the report. I agree, I can see the tooltips but not the lines. Without the code that was used to produce the "broken" and working examples, it's impossible to know what's going on. I've certainly not seen this before, and the console isn't telling me anything useful, if that helps.
The code is what I posted here: #3671. The code used to produce both examples is identical; I literally just ran the code and ran it again. Sometimes the issue occurs, sometimes not. It's not deterministic!
I had misunderstood - I thought it was at runtime - when the html loaded, not running your bokeh script. I am not sure what's going on. My only guess is that |
So I got rid of Is it possible for someone to look at the broken link above to see the ordering of the output? It is different than the "good" plot, suggesting that the output order is affecting the display of the data. I have no idea how to fix this and it makes automation of the script impossible (I have to babysit it and make sure it properly generates the plots). |
Possible clue: the working HTML page is 10x larger than the broken one.
@havocp Since I first posted this issue more data was added to the plot. The file sizes were identical previously. I'll try to grab a broken example again the next time the script executes.
This is kind of a deal-breaker for me for Bokeh, I can't have something that doesn't plot the data half the time. Does anyone have any idea of things I can try to fix this? What is Bokeh even doing here that would result in the non-deterministic output I'm seeing? Can I force Bokeh to output in a particular order, if that's what's causing the issue? |
Do you have time to try the cleaned json diff I suggested? ideally two pages with the exact same data but one doesn't work ... I have no idea what would cause this so step 1 for me would be to see what's different about the working and broken pages, by getting things human readable then diffing. also maybe given a script to make the pretty diff you could automate generating a failure because you could regenerate, remove noise such as changed uuids, and then if the two pages aren't identical you know one of them is broken. There's a "bokeh json" command too so if the problem is the json varies we could get the html out of the picture. |
Do you have a way to freeze the data so you can take variation in the data out of the picture for debugging?
Also useful would be to try to minimize a test case that's easy for anyone to run (includes all needed data files, etc).
OK, I have two files that should not change now, one broken, one working. A diff is just a mess, the JSON is in a completely different order: https://sne.space/sne/SN1990aa-broken.html |
I can't work on this tonight but I'd probably write a little json cleaner script to sort the keys and change all the uuids to just the string "uuid" or something and then pretty print. |
if json.dumps doesn't have a sort keys option it might keep the order if you hand it ordereddict instances I don't know... |
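The cleaner script suggested above could look roughly like the following. This is only a sketch: `json.dumps` does accept `sort_keys=True`, but the assumption that the ids in the embedded JSON are UUID-shaped strings, and the helper names `scrub`, `normalize` and `diff`, are illustrative and not taken from Bokeh.

```python
import difflib
import json
import re

# Assumption: ids look like standard 8-4-4-4-12 hex UUIDs.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def scrub(obj):
    """Recursively replace UUID-like string values with the literal "uuid"."""
    if isinstance(obj, dict):
        return {key: scrub(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [scrub(value) for value in obj]
    if isinstance(obj, str) and UUID_RE.match(obj):
        return "uuid"
    return obj


def normalize(text):
    """Parse JSON text and pretty-print it with sorted keys and scrubbed ids."""
    return json.dumps(scrub(json.loads(text)), sort_keys=True, indent=2)


def diff(good_text, bad_text):
    """Return unified-diff lines between two normalized JSON documents."""
    good = normalize(good_text).splitlines()
    bad = normalize(bad_text).splitlines()
    return list(difflib.unified_diff(good, bad, "good", "bad", lineterm=""))
```

Sorting only affects object keys. List order is preserved, so if the two files list their models in a different order, that difference will still show up in the diff.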
another possibility is the different order is precisely the problem I suppose! maybe the wrong order of building the JavaScript models breaks. but that's pure speculation so I'd probably rule out other differences first if it were me |
Possibly the problem (in a column data source in the bad file):
There is an assumption that is not well enforced (I am working on this right now as a matter of fact) that all columns in a column data source be the same length. The best mental model for a column data source is a "cheap dataframe" (i.e., a collection of series all of the same length). I would expect that a glyph that refers to any of the "short" columns like this would only plot one point, which for a line means plotting nothing at all. |
If you want to have fixed values for a certain field you can set that field as a value explicitly:
The main thing is not to have arrays of different lengths in a column data source. I'm working on a PR in conjunction with the streaming interface that enforces and regulates the column-length assumption much more rigorously and loudly. |
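Until that enforcement lands, the data dict can be checked before it is handed to a column data source. A minimal sketch in plain Python, assuming the data is a dict of lists; the helper names are hypothetical and not part of Bokeh:

```python
def column_lengths(data):
    """Map each column name to the number of values it holds."""
    return {name: len(values) for name, values in data.items()}


def check_equal_lengths(data):
    """Raise ValueError if the columns do not all have the same length."""
    lengths = column_lengths(data)
    if len(set(lengths.values())) > 1:
        raise ValueError(f"unequal column lengths: {lengths}")
    return data


def pad_single_values(data):
    """Repeat one-element columns so they match the longest column."""
    target = max(column_lengths(data).values(), default=0)
    return {
        name: list(values) * target if len(values) == 1 else list(values)
        for name, values in data.items()
    }
```

For example, `pad_single_values({"x": [1, 2, 3], "y": [4, 5, 6], "binsize": [0.5]})` repeats the single `binsize` value three times, after which `check_equal_lengths` passes.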
Another possibility, in the bad file:
In the "good" file, the |
More ideas: there are "xs" and "ys" in the bad file that some glyph refers to, but they are empty:
Edit: seems to happen in the good file too, though |
As an aside it suddenly seems like a |
Thank you very much, @guillochon, for asking this question! I spent almost 2 days wondering, researching, and trying different things, thinking I probably didn't code the datasource updater function or the session setup correctly (and pulling out my hair when nothing seemed to work). Although the manifestation of the bug is slightly different, the root cause is the same: the x-axis and y-axis arrays cannot have different numbers of elements. In my case, my real-time line plots took a few minutes to show up and reflect the real-time changes in the y-axis data, and they did this consistently. The same code was working in Bokeh v0.10.0 but started behaving this way once I migrated it to v0.11.0. v0.10.0 was throwing an error about non-equal column data lengths, but the latest version does not.
I don't recall an explicit decision to turn off the "unequal length" warning (which doesn't mean there wasn't one, but I don't happen to recall any), so maybe that is a regression. It's also possible this is not the same problem as @guillochon but hopefully it might be. I'm working on some cleanup and refactoring that will definitely make things more chatty on the JS side when this situation occurs, and we will revisit the python side warning as well. |
Just to clarify again, the bad and the good files are produced by identical code, using identical inputs. I re-ran the code ten times in a row; sometimes the output is "good," sometimes it is "bad." The code runs in a loop and produces a few thousand html files each time, and it appears that if one of the files is bad, they are all bad, and if one of the files is good, they're all good. So what I'm doing now is running the code, looking at the first html file, and if it's "bad" I kill the program and restart it until the first output is "good"; then all my outputs are good. This is extremely weird behavior. The only way I think this can occur is if Bokeh's output ordering is not fixed. That, I think, is the real underlying issue here: I do not understand how the output ordering is not guaranteed. If it's the unequal column data source thing, the way it's manifesting "good" and "bad" outcomes is non-deterministic; I cannot predict whether the output will be "good" or "bad" until I've run the code and looked at the result. I will give that a try but I am skeptical that this is the real issue.
I've forced all of the data fields to be of equal length and am re-running multiple times in a row now to see if I can trigger the "bad" behavior. So far it's producing only "good" output, but in my experience it may take a few tries to trigger. So a thought on perhaps why the column length mismatch causes this: if one is defining multiple fields in a column data source, the order in which the elements appear in that column data source is unpredictable, because internally it's a "dict" object (which doesn't guarantee key order). So, if one of my one-element arrays happens to be ordered first, then it reads the data as having a length of "1", and doesn't plot it. I don't understand your solution though @bryevdv, I cannot define "binsize" for instance in the way that you did in your example (only "x" and "y" and other variables that
If we can capture bad and good and make a human readable diff I hope things will become clear... were the column lengths only unequal in "bad"? was anything else different from bad to good? I have no guess why column lengths would vary from run to run in a way that would be caused by bokeh. If the column lengths fix doesn't work I recommend a systematic approach (find the diff and root-cause it) rather than trying things haphazardly. The time to solution will be much more deterministic that way. |
@havocp They were unequal in both cases, because the code and inputs are completely identical, no changes at all in the code. I think I see why the unequal column lengths might be causing this, and it's because of |
I understand there are no code changes - I'm talking about understanding the differences in output to be clear. is dict order the only difference in good vs bad output ? or are there others? I think Python may randomize hash keys and thus dict order as a security measure. I have some fuzzy memory of that. |
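Hash randomization can be observed directly by pinning `PYTHONHASHSEED`, a standard CPython environment variable. The sketch below is illustrative only: it uses a set of strings, because in current Python versions (3.7 and later) dicts preserve insertion order, so only on older interpreters could dict order itself change from run to run. The column names and the helper name are assumptions.

```python
import os
import subprocess
import sys

# Hypothetical column names, used only to have some strings to hash.
NAMES = ["x", "y", "binsize", "color", "size", "label"]


def iteration_order(seed):
    """Return the iteration order of a set of strings in a fresh interpreter
    started with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    code = "print(','.join(set(%r)))" % (NAMES,)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().split(",")


if __name__ == "__main__":
    # The same seed always yields the same order; different seeds may not.
    for seed in (0, 1, 2):
        print(seed, iteration_order(seed))
```

Running a script with a fixed `PYTHONHASHSEED` is one way to make an order-dependent failure reproducible while debugging.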
We're working on a "namespace" model specifically to store bits of state like this. In the meantime, since you seem to have a CustomJS callback, you can execute arbitrary JS code, so you can attach these bits of data anywhere really.

Given what is known about dict key ordering, I am more confident this is the issue. For reference, here is the code in the source that reports the "length":

If the key order is not predictable, that unpredictability will clearly be propagated there. This length value is used by both glyph renderers and properties to condition the render loops. As I said, expect the data source assumptions to be enforced much more rigidly in the future.
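To illustrate how key order could propagate into a reported length, here is a toy model. It is not Bokeh's actual code; it simply assumes, as hypothesized earlier in the thread, that the length is taken from whichever column happens to come first.

```python
def reported_length(data):
    """Toy model: report the length of the first column encountered.

    Assumption for illustration only; this is not Bokeh's implementation.
    """
    for values in data.values():
        return len(values)
    return 0


# Same columns, different key order (made explicit here, since current
# Python dicts preserve insertion order).
short_first = {"binsize": [0.5], "x": [1, 2, 3], "y": [4, 5, 6]}
long_first = {"x": [1, 2, 3], "y": [4, 5, 6], "binsize": [0.5]}

print(reported_length(short_first))  # 1: a line glyph would get a single point
print(reported_length(long_first))   # 3: the line is drawn as expected
```

Under this assumption, a one-element column that happens to be iterated first would make every glyph see a single point, which matches the all-good or all-bad behavior seen within a single run.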
Well as I also mentioned, there used to be a very loud warning, still need to look into what happened to it. |
OK, there was a recent discussion on the Discourse that surfaced a better approach to this. The full discussion is there, but I wanted to record the basic idea here, which is simply that we provide a mechanism to turn new Python Bokeh models that only require properties, and no other JS implementation, into custom extensions automatically, with no JS required from the users. This would enable users to create nicely typed custom scratch spaces that automatically sync and trigger events in exactly the same way as existing models. This is a much better approach, both for users and for us to implement.

Some questions: Does this happen truly automagically for any Model subclass that is not a built-in? I.e. does just defining
suffice, with nothing else needed? Or there could be a special parent class whose subclasses automatically get synthesized:
Or, we could require an explicit registration step for "data models" that are defined as regular I am inclined to prefer 2 or 3 I think. For example in the docs we define |
I think |
Noting that as part of this work, since it will be necessary to allow
we should ensure that any model (e.g. sources) can also be added (maybe that's already possible)
I very much need the Dynamic DataModel. I need it to add bidirectional communication to html elements included via the Jinja Template or via https://github.com/paulopes/panel-components, which is a wrapper of the Jinja Template into something much more in the style of Dash layouts or R Shiny. I have also requested it for Panel here: holoviz/panel#1612. For now I don't know how to create a dynamic DataModel from a But any kind of guidance in the right direction, some POC code, or an actual Bokeh Data Model would be very much appreciated. @philippjfr said that @mattpap had already experimented with or worked a bit on this? If some code or learnings could be shared, that would be great. As inspiration for why this is valuable, take a look at this example, where we just need to be able to add bidirectional communication to a fast-button html element: https://github.com/paulopes/panel-components/blob/master/examples/fast_hello_world.py
@mattpap, I can see you have opened a pull request to solve this. FYI @philippjfr, I have also created something. I guess it's different than what you have done, but you can see the details of what I need and what I have created here: holoviz/panel#1612 (comment). The implementation is in the
See #3674 (comment) for a new, better idea for this feature.
Hi all, I'm having an issue where line plots will sometimes be invisible when rendered. This error seems to be non-deterministic; when I run the same code repeatedly, the plots will sometimes render as normal, and sometimes render invisibly.
An example page that is currently showing these "invisible" plots is available here (2nd plot from the top): https://sne.space/sne/SN1993J-broken.html. One thing you can notice is that hovering over the plot still produces tooltips, and an examination of the data within the page source reveals that the `line_color` attributes are non-white, and that the data is actually available. But for some reason, the lines are not visible!
And here's the exact same page when the render does work: https://sne.space/sne/SN1993J.html. The file sizes are identical but the file contents are not identical. It seems like it might be a different command order in the two files?
Anyone know what's going on?