Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrating websocket.py into existing service #71

Closed
astrand opened this issue Mar 11, 2013 · 3 comments
Closed

Integrating websocket.py into existing service #71

astrand opened this issue Mar 11, 2013 · 3 comments
Labels
feature New feature or request fixed patchwelcome

Comments

@astrand
Copy link

astrand commented Mar 11, 2013

I'm struggling with how to integrate websocket.py with an existing Python service. I'm hoping you can provide some advice. Here's the case:

We have an existing web service written in Python, based on Cheetah templates. The engine is the Python stock SocketServer and BaseHTTPServer. A separate binary/process deals with TLS encryption and communicates with the Python process over a UNIX socket. The service is created with:

class ForkingHTTPServer(SocketServer.ForkingMixIn, SocketServer.UnixStreamServer):

The handler derives from BaseHTTPRequestHandler. All in all, pretty standard and straight forward.

Now, the question is how websocket.py could be used in such context. I've noticed that websocket.py does not use any stock service engine, but rather creates sockets, forks etc manually. In order to provide websocket.WebSocketServer with existing file descriptors etc, we would need to override WebSocketServer.start_server(), but is this really safe and intended?

Another problem is that init prints stuff to standard out. Shouldn't it use msg() instead?

In general, websocket.py is very nice, but I think it would be even better if there was a clear separation between the Websocket "protocol" and service stuff such as sockets, TLS, VNC recording etc. Thoughts?

@astrand
Copy link
Author

astrand commented Mar 11, 2013

(pasting in email reply from kanaka)

I'm struggling with how to integrate websocket.py with an existing Python service. I'm hoping you can provide some advice. Here's the case:

We have an existing web service written in Python, based on Cheetah templates. The engine is the Python stock SocketServer and BaseHTTPServer. A separate binary/process deals with TLS encryption and communicates with the Python process over a UNIX socket. The service is created with:

class ForkingHTTPServer(SocketServer.ForkingMixIn, SocketServer.UnixStreamServer):

The handler derives from BaseHTTPRequestHandler. All in all, pretty standard and straight forward.

Now, the question is how websocket.py could be used in such context. I've noticed that websocket.py does not use any stock service engine, but rather creates sockets, forks etc manually. In order to provide websocket.WebSocketServer with existing file descriptors etc, we would need to override WebSocketServer.start_server(), but is this really safe and intended?

That's a really good question. There are several factors that complicate the situation:

  • The first complication is that in order for websockify to reply to Flash policy request it needs to be the first consumer of the socket because the Flash policy request is not compatible with normal HTTP. This is less and less important over time as native support for WebSockets becomes more prevalent so eventually the Flash policy response will probably go away.
  • Another complication is that in order to support the older Hixie protocol we need to be able to read 8 raw bytes after the request headers during the WebSocket GET handshake. This is one of the reasons why I override do_GET in the WSRequestHandler. At some point soon when I get the chance I will merge your pull to remove Hixie protocol  so this will be less of an issue. The only remaining reason why I am defining WSRequestHandler is to parse headers and provide a simple embedded web server if enabled (--web).
  • I want the default mode of websockify to not serve normal HTTP requests. Being able to serve web requests is just a convenience really. However, the built-in web serving functionality is pretty basic and doesn't perform well.
  • websockify currently supports python 2.4 through 3.X in the same code base. The HTTP modules have changed subtly across those releases but I would still like to continue to support all of them if possible in the same code base (python 2.4 is still surprisingly prevalent and it's the thing that I get the most reports about if I break it).

Anyways, the biggest issue really is that websockify needs direct access to the raw socket after the handshake. This might be a lot easier to do if you drop Flash policy and Hixie. It's been a while since I looked into the structure of HTTP classes to see if there is an easy way to access the underlying socket. For example, I recall that rfile and wfile and streams over the socket and those are not sufficient. One possibility would be for the desired handler to be passed in as an argument and then monkey patched to do what is needed and capture the raw socket.

Certainly, if you came up with a patch to make websockify more cleanly integrate with existing python web serving classes, I would be happy to consider merging it into the upstream. In particular if there is a way to detect a Flash policy request (e.g. by catching an exception or something like that) then that would be ideal and allow Flash policy requests to be handled.

Another problem is that init prints stuff to standard out. Shouldn't it use msg() instead?

Yep! There are a couple other places that use print() to that should be changed also. Send me a patch :-)

In general, websocket.py is very nice, but I think it would be even better if there was a clear separation between the Websocket "protocol" and service stuff such as sockets, TLS, VNC recording etc. Thoughts?

In general, I strongly agree. I don't have time to work on this directly except to give advice, review patches, and merge in good changes. It's no longer part of my day job and I just started part-time school, so as much as I would like, I don't have much opportunity to improve websockify or noVNC. The main open source project that I work on when I do have a few moments of free time is clojurescript-in-clojurescript (https://github.com/kanaka/clojurescript [github.com]).

BTW, websockify has quite a bit of functionality so there is a fair amount of baggage to pull forward with major refactoring. However, the base functionality of websockify is actually quite simple: accept WebSocket connections, open a raw TCP socket, and then bridge between the two by framing/unframing to WebSocket. I am willing to host two python implementations for a while in the upstream repo and gradually add the current functionality to the new implementation as I (or others) have time. Just thought I would throw that out there to try and entice you ;-)

@astrand
Copy link
Author

astrand commented Mar 11, 2013

That's a really good question. There are several factors that complicate the situation:

  • The first complication is that in order for websockify to reply to Flash policy request it needs to be the first consumer of the socket because the Flash policy request is not compatible with normal HTTP. This is less and less important over time as native support for WebSockets becomes more prevalent so eventually the Flash policy response will probably go away.

Yes, in our case, we do not need support for Flash policy request handling.

  • Another complication is that in order to support the older Hixie protocol we need to be able to read 8 raw bytes after the request headers during the WebSocket GET handshake. This is one of the reasons why I override do_GET in the WSRequestHandler. At some point soon when I get the chance I will merge your pull to remove Hixie protocol so this will be less of an issue. The only remaining reason why I am defining WSRequestHandler is to parse headers and provide a simple embedded web server if enabled (--web).

I see. We do not need Hixie, but a basic web server.

  • I want the default mode of websockify to not serve normal HTTP requests. Being able to serve web requests is just a convenience really. However, the built-in web serving functionality is pretty basic and doesn't perform well.

Ok.

  • websockify currently supports python 2.4 through 3.X in the same code base. The HTTP modules have changed subtly across those releases but I would still like to continue to support all of them if possible in the same code base (python 2.4 is still surprisingly prevalent and it's the thing that I get the most reports about if I break it).

Yes, our requirements are "Python 2.4 or newer 2.X version", ie we do not support Python 3 yet.

Anyways, the biggest issue really is that websockify needs direct access to the raw socket after the handshake. This might be a lot easier to do if you drop Flash policy and Hixie. It's been a while since I looked into the structure of HTTP classes to see if there is an easy way to access the underlying socket. For example, I recall that rfile and wfile and streams over the socket and those are not sufficient. One possibility would be for the desired handler to be passed in as an argument and then monkey patched to do what is needed and capture the raw socket.

The documentation for the SocketServer module says: "For stream services, self.request is a socket object". Limiting ourselves to stream services should be a reasonable limitation. However, one problem is that BaseHTTPRequestServer et al assumes that rfile is available, since it wants to use .readline() etc. Another problem is that rfile is buffered by default, so if we switch to using the raw socket later in, we might have stale data in the rfile buffer.

In general, websocket.py is very nice, but I think it would be even better if there was a clear separation between the Websocket "protocol" and service stuff such as sockets, TLS, VNC recording etc. Thoughts?

In general, I strongly agree. I don't have time to work on this directly except to give advice, review patches, and merge in good changes. It's no longer part of my day job and I just started part-time school, so as much as I would like, I don't have much opportunity to improve websockify or noVNC. The main open source project that I work on when I do have a few moments of free time is clojurescript-in-clojurescript (https://github.com/kanaka/clojurescript [github.com]).

BTW, websockify has quite a bit of functionality so there is a fair amount of baggage to pull forward with major refactoring. However, the base functionality of websockify is actually quite simple: accept WebSocket connections, open a raw TCP socket, and then bridge between the two by framing/unframing to WebSocket. I am willing to host two python implementations for a while in the upstream repo and gradually add the current functionality to the new implementation as I (or others) have time. Just thought I would throw that out there to try and entice you ;-)

:-) I'll see what I can come up with.

astrand pushed a commit that referenced this issue Nov 28, 2013
Rename self.client to self.request, since this is what standard
SocketServer request handlers are using.
astrand pushed a commit that referenced this issue Nov 28, 2013
Move around functions and methods, so that connection-related and
server-related stuff are close together.

This patch just moves things around - it does not change anything at
all. This can be verified with:

git diff websocket.py | grep ^- | cut -c 2- | sort > removed
git diff websocket.py | grep ^+ | cut -c 2- | sort > added
diff -u removed added
astrand pushed a commit that referenced this issue Nov 28, 2013
* Move traffic_legend.

* Since websocket.WebSocketServer.socket is static, don't call it with
  self.socket.
astrand pushed a commit that referenced this issue Nov 28, 2013
refactoring. Basically, we are dividing WebSocketServer into two
classes: One request handler following the SocketServer Requesthandler
API, and one optional server engine. The standard Python SocketServer
engine can also be used.

websocketproxy.py has been updated to match the API change. I've also
added a new option --libserver in order to use the Python built in
server instead.

I've done a lot of testing with the new code. This includes: verbose,
daemon, run-once, timeout, idle-timeout, ssl, web, libserver. I've
tested both Python 2 and 3. I've also tested websocket.py in another
external service.

Code details follows:

* The new request handler class is called WebSocketRequestHandler,
  inheriting SimpleHTTPRequestHandler.

* The service engine is called WebSocketServer, just like before.

* do_websocket_handshake: Using send_header() etc, instead of manually
  sending HTTP response.

* A new method called handle_websocket() upgrades the connection to
  WebSocket, if requested. Otherwise, it returns False. A typical
  application use is:

    def do_GET(self):
        if not self.handle_websocket():
	   # handle normal requests

* new_client has been renamed to new_websocket_client, in order to
  have a better name in the SocketServer/HTTPServer request handler
  hierarchy.

* Note that in the request handler, configuration variables must be
  provided by the "server" object, ie self.server.target_host.
astrand pushed a commit that referenced this issue Nov 28, 2013
>commit b9e1295
>    Prepare for solving #71:
>
>    Rename self.client to self.request, since this is what standard
>    SocketServer request handlers are using.
astrand pushed a commit that referenced this issue Dec 17, 2013
astrand/websockify:

    Prepare for fixing #71:
    Move around functions and methods, so that connection-related and
    server-related stuff are close together.

    This patch just moves things around - it does not change anything at
    all. This can be verified with:

    git diff websocket.py | grep ^- | cut -c 2- | sort > removed
    git diff websocket.py | grep ^+ | cut -c 2- | sort > added
    diff -u removed added
@nmz787-intel
Copy link

is there an example script showing how to use this? I for example, am using Flask. Based on what I gleaned last year from this repo and maybe others, I came up with a solution that is working albeit slowly, using two threads once a wss connection comes into Flask. Then I just open a socket, and thread1 listens (recv) on it and sends to the ws, thread2 listens on the ws and forwards to the socket. Performance is low single-digit frames per second. Now that I've proven the concept works, I'm looking to clean things up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request fixed patchwelcome
Projects
None yet
Development

No branches or pull requests

3 participants