--num-procs X and curdoc().session_context.request.arguments don't go well together #5582
Comments
This example is working for me on current master, with the issue you mention: some percentage of session loads fail due to the
Thanks I'll look into that. Adding
I have been encountering this same bug without num-procs as well. I am, however, using Python threading for some IO (fetching data to feed to bokeh).
@bwkeller can you provide a minimal example to duplicate what you are seeing? I am not convinced it would be the same problem, but it is impossible to investigate thoroughly without code to see and run.
I'll see what I can do. It is some proprietary client code I'm working on, so I may need to just write something from scratch to try and reproduce. The bug is extremely intermittent for me (less than 1% of page requests, I would estimate). It's definitely the same proximate cause, curdoc().session_context.request == None, for what that's worth.
That's definitely useful to know, thanks
Just to add another data point, this happens to me too arbitrarily, even on apps doing nothing fancy at all; no
I've seen this without any threading or numprocs too, and not infrequently, on a fairly simple app. It feels like there's something which will put the process into a state whereby every subsequent request fails with the error
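Until the root cause is addressed, an app can at least avoid crashing when the request is missing. The following is only a defensive sketch, not a fix: `get_request_arguments` is a hypothetical helper name, and the `SimpleNamespace` objects merely stand in for whatever `curdoc().session_context` returns.

```python
from types import SimpleNamespace


def get_request_arguments(session_context):
    """Return the request arguments, or {} when no request is attached.

    Hypothetical helper: it only relies on the object exposing
    `.request.arguments`, as reported in this thread.
    """
    request = getattr(session_context, "request", None)
    if request is None:
        # The failure mode reported above: request == None.
        return {}
    return getattr(request, "arguments", None) or {}


# Stand-ins for a session context with and without a request.
broken = SimpleNamespace(request=None)
working = SimpleNamespace(request=SimpleNamespace(arguments={"N": [b"10"]}))

print(get_request_arguments(broken))
print(get_request_arguments(working))
```

In an app this would be called as `get_request_arguments(curdoc().session_context)`. It hides the symptom (the app falls back to defaults) rather than recovering the lost arguments.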
I think I know what's going on. First the HTML delivers the javascript to open the websocket, and includes the sessionId so that the websocket request can get access to the Session object that was created with the initial HTML request. But the websocket request is likely to land on a different server process where that sessionId is meaningless. I think the demo apps have only been working because they don't need anything from the original HTTP request, so I'm guessing whatever process the websocket request happens to land on is just creating a brand new session object when it can't find the one specified by the id.

I think this can't be solved easily. The server architecture assumes that any incoming websocket connection will be able to find the ServerSession object by its id, which only works if there's a single shared memory space for all the sessions. I don't know exactly how tornado does its forking, but I'd be really surprised if the session dictionary is somehow in a shared memory space between all the server processes. The same problem gets even worse if you're actually scaled out to multiple machines -- then the only solution is to have a shared session store like redis/memcache or something, and then the locking gets more complicated.

But I think this is a symptom of a basic design flaw. People figured all this stuff out in the 1990s with web applications, so the correct design patterns should be well known. But they get harder when you're trying to support real-time updates.
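To make the hypothesis above concrete, here is a toy simulation. It is not Bokeh's server code: `WorkerProcess`, `handle_http` and `handle_websocket` are made-up names, and each instance simply models one forked process with its own private session dictionary.

```python
class WorkerProcess:
    """Hypothetical stand-in for one forked server process."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # per-process memory, NOT shared with other workers

    def handle_http(self, session_id, arguments):
        # The initial HTML request creates the session and records its arguments.
        self.sessions[session_id] = {"request_arguments": arguments}

    def handle_websocket(self, session_id):
        # An unknown id silently gets a brand new session with no request info.
        session = self.sessions.setdefault(
            session_id, {"request_arguments": None}
        )
        return session["request_arguments"]


workers = [WorkerProcess("p0"), WorkerProcess("p1")]

# The HTML request lands on p0.
workers[0].handle_http("abc", {"N": [b"10"]})

# The websocket request may land on either process.
same_process = workers[0].handle_websocket("abc")
other_process = workers[1].handle_websocket("abc")

print(same_process)
print(other_process)
```

With two workers and random routing, roughly half of the websocket connections would see no request, which would match the "roughly 50% of the cases" in the original report if this analysis is right.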
@leopd so looking at
So offhand I'm inclined to think your analysis is correct. Ping @havocp, your thoughts welcome here.

We have discussed the possible need for session affinity in some circumstances. It seems as though it is possible to configure some reverse proxies (e.g. nginx) for session affinity. So it is possible an answer to this could be documentation about correctly configuring all these additional tools for using Bokeh in "scale out" situations. Alternatively, (or perhaps additionally) another idea would be to have

FWIW the HTML connection is immediately upgraded to a websocket and all further comms happen over that. I don't think the original Bokeh session is needed if the WS upgrade happens to hit a different server or process. Or in other words, I think creating new Bokeh sessions on demand is OK, and the specific problem to solve is restricted to communicating just the original HTML request to the WS session.

Thoughts here appreciated. Additional help from people with more expertise (more than I have, specifically) would be extremely valuable and welcome here.
Commenting specifically in the context of this issue (single machine with fork), I think implementing a shared mapping of session IDs to HTML requests is doable. Not entirely trivial, but certainly similar to things I have had to do in the past.
Yeah, something here isn't quite right. Sorry about that...

Session affinity / sticky sessions are required for this to work, probably. This is a consequence of keeping server-side state so apps can be written in Python (if we kept getting a new server-side context, then we'd make the app development model more complex). The original plan we discussed, IIRC, was always for scaled-out production deployments of Bokeh to have sticky sessions.

There's kind of an inherent session stickiness due to the websocket (once open, it always goes to the same server process). In simple cases without sticky sessions in the reverse proxy / load balancer, the original http request creates the session state, and then that session state is pretty much discarded in favor of the state the websocket request creates. But then the websocket stays connected to the same app server node and we don't need to create session state after that second time. So as long as the app doesn't care about the state created in the initial http request, things behave as if sessions are sticky without special behavior from your reverse proxy or load balancer.

This has let us kick the sticky sessions can down the road. request.arguments breaks this, and now it's necessary to deal with state created by the initial http request.

It looks like we actually noticed this when adding request.arguments in #4858, where I said
However obviously that wasn't thought through fully; it isn't just confusing and most of the time it does matter, if using more than one process. What I forgot when saying "that shouldn't matter" is probably that there are two requests, the http one and the websocket one, in typical usage. Maybe I was wrongly remembering that in typical usage each session is only created for the websocket.

As you say, this has been figured out for web applications. However, Bokeh can't easily use the same answer because it isn't a regular web framework; in general, it hides http entirely! Bokeh gives a Python-data-science type of programming model that doesn't require people to be web devs (mess with http, JavaScript, and all that). This is done by having a big blob of Python state on the server (the Document) and syncing it to the client automatically, which of course is not how most web apps are written (they would keep state in a database, instead).

request.arguments was bolted on post-initial-Bokeh design, as a little escape hatch to get a little info from http. The problem now is that this sort of cascades; once we introduce the notion that web apps have requests, then we've also introduced the issue that each request should be stateless, and now you need stuff like cookies or a database or Redis to store your state across requests... Bokeh of course doesn't support setting cookies because it doesn't use the stateless http request/response web app model in the first place.

Some possible solutions:
I'm a little skeptical of automatically forwarding request.arguments around; after all, if dropping to the http layer, maybe you actually care about this request, and might even want to know that the websocket request did not have arguments. But supporting some way to copy request.arguments into a shared location could be handy.

The danger is that if we go too far down the road of trying to allow writing a full-blown web app with full-blown http access in Bokeh, it will lose track of the actual original point which was to enable writing apps without learning http/javascript/etc.

My instinct is probably to focus on making session affinity work well; the

But I don't know. Hope the above gives someone else some ideas.

Note that you certainly can today use Redis or a database with Bokeh to store stuff keyed by session ID, and that's no worse than where you'd be with Django or Flask or something, perhaps.
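As a rough illustration of "store stuff keyed by session ID", here is a minimal sketch of the interface such a store might have. `SessionKeyedStore` and its methods are hypothetical, and the plain dict is only a placeholder: a dict lives in one process, so a real multi-process deployment would need Redis, a database, or similar behind the same two calls.

```python
import json


class SessionKeyedStore:
    """Hypothetical store interface; the dict stands in for Redis or a database."""

    def __init__(self):
        self._data = {}

    def put(self, session_id, value):
        # Serialize to JSON, as an external store would only hold strings/bytes.
        self._data[session_id] = json.dumps(value)

    def get(self, session_id, default=None):
        raw = self._data.get(session_id)
        return default if raw is None else json.loads(raw)


store = SessionKeyedStore()

# The HTTP-side handler records what it saw for this session id.
store.put("abc", {"N": ["10"]})

# Later, app code for the same session id looks it up again.
print(store.get("abc"))
print(store.get("unknown", default={}))
```

Keeping values JSON-serializable from the start makes it easier to swap the dict for an external backend later.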
Yeah, it's tricky. And I really appreciate the balance you're striking of making this stuff easy to do for python programmers who don't need to dive into these kinds of distributed systems issues.

Session affinity on the load-balancer is a reasonably good band-aid (e.g. https://www.nginx.com/products/session-persistence/ ). But it doesn't handle the

I actually like
(It's a special case of the trick early web app frameworks like ASP used to serialize session down to the client.) It doesn't fully work if you need to run custom logic to render the HTML based on the request. But in the cases I've worked with so far, the HTML is just a dumb shell, and thus could be rendered without invoking any app code -- save that until the WS is opened, and as you say it's got intrinsic stickiness. It wouldn't cover page reloads with

Something like a shared session store with pluggable backends -- redis/memcache/single-machine-shared-memory is the right way to really scale this out, but it gets messy when you think about supporting multiple writers which requires distributed locking or pub/sub or something. A routing layer that effectively implements stickiness is almost easier. If you can discount the multiple-writer use-case it gets a lot easier.
One more point which my earlier comment was confusing about... If the WS request includes a copy of the query params, then a lot of common cases work and scale well, with or without sticky load balancers, and with or without
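A minimal sketch of that idea, assuming the page template can build the websocket URL itself: the original query params are appended to the WS URL and parsed again on whichever process receives the upgrade. The function names, the `session-id` parameter name and the example URL are all assumptions for illustration, not Bokeh's actual protocol.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def websocket_url(base_ws_url, session_id, http_arguments):
    """Build a WS URL that carries a copy of the original query params."""
    params = [("session-id", session_id)]  # hypothetical parameter name
    for name, values in http_arguments.items():
        params.extend((name, value) for value in values)
    return base_ws_url + "?" + urlencode(params)


def arguments_from_websocket_url(url):
    """Recover the forwarded params on the process that gets the WS request."""
    query = parse_qs(urlsplit(url).query)
    query.pop("session-id", None)
    return query


original = {"N": ["10"], "tag": ["a b", "c"]}
url = websocket_url("ws://localhost:5006/sliders/ws", "abc", original)
recovered = arguments_from_websocket_url(url)

print(url)
print(recovered == original)
```

Since the arguments travel with the websocket request itself, no state has to be shared between processes for this case. The sketch uses string values; request arguments that arrive as bytes would come back as strings after the round trip.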
OK something like that had occurred to me but it seemed possibly too simplistic. But if both @havocp and @leopd think it's worth pursuing it seems possible to do for

Questions:
Regarding:
The app code is always re-executed to generate a new Bokeh
Cookies would need to be scoped to the sessionId. If there was just a single cookie, then the request args would leak across different documents in the same browser. But if the cookie name includes the sessionId, that would work.

I'm not sure what's in the full HTML request that's not the arguments. POST body maybe? That could be relevant. Exotic things like custom request headers IMHO aren't worth worrying about.

All this will work assuming that the app is effectively "read-only" on the underlying data. If creating the document has any side-effects, none of this works. But I think that's the most common scenario.
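For what it's worth, the session-scoped cookie naming could look something like the sketch below. It only uses the standard library; the `bk-args-` prefix and both function names are made up for illustration, and it assumes session ids contain only characters that are legal in cookie names.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, quote, unquote, urlencode

COOKIE_PREFIX = "bk-args-"  # hypothetical prefix, not an existing Bokeh cookie


def make_args_cookie(session_id, arguments):
    """Return a 'name=value' cookie string whose name includes the session id."""
    cookie = SimpleCookie()
    # Percent-quote the encoded args so the value holds only cookie-safe chars.
    value = quote(urlencode(arguments, doseq=True), safe="")
    cookie[COOKIE_PREFIX + session_id] = value
    return cookie.output(header="").strip()


def read_args_cookie(cookie_header, session_id):
    """Return the args stored for this session id, or None if there are none."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_PREFIX + session_id)
    return None if morsel is None else parse_qs(unquote(morsel.value))


# Two documents open in the same browser, each with its own session id.
header = "; ".join([
    make_args_cookie("s1", {"N": ["10"], "tag": ["a b"]}),
    make_args_cookie("s2", {"N": ["20"]}),
])

print(read_args_cookie(header, "s1"))
print(read_args_cookie(header, "s2"))
print(read_args_cookie(header, "s3"))
```

Because each cookie name carries its session id, the args of one document never show up when looking up another, which is the leak described above.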
OK, thanks for the comments. I will concentrate on just the request then, as that's the scope of the original feature anyway. I will also for now just encode the request args in the template that embeds the document. There is a separate issue to explore using cookies for session IDs, and I can look at everything together when I get to that issue.
Can you elaborate on this? Although it needs to be updated, the old "Happiness" demo showed a way to send user-specific information based on an authenticated user. If a user id of some sort is available to an app and the app makes reads, writes or updates to some external persistent store based on that user id, I don't see what the issue would be. I don't think this is the case you are referring to, but I want to make sure I understand.
If I add these two lines:
to `examples/app/sliders.py` and run bokeh with:
and then reload the page, it crashes in roughly 50% of the cases with:
I guess one of the processes has the `request` not set properly. It works if I don't add `--num-procs`.
This happens with bokeh 0.12.3 and latest master (unrelated: master does not show the example at all, is it currently broken?)
For reference, this is the full example to reproduce: