use SockJS instead of pure WebSockets #2321

Closed
wants to merge 10 commits into
from

8 participants

@minrk
Member
minrk commented Aug 21, 2012

Should work in more environments than plain websockets.

Still some work to do, checking out authentication, etc., but the basics definitely work.

Third-party code added: sockjs-tornado (in IPython.external), sockjs-client (in static/sockjs).

@travisbot

This pull request passes (merged 99b9ae77 into 3c8d448).

@minrk
Member
minrk commented Aug 21, 2012

This was meant to be just an experiment to see what would be involved in switching over to SockJS, and it turns out it was super easy. After dropping in the JS and tornado code, it was only about an hour of fiddling to make the changes necessary for it to be basically functional.

@travisbot

This pull request passes (merged 68a2c5fa into 3c8d448).

@minrk
Member
minrk commented Aug 21, 2012

I tried my own attack for unauthorized execution, and it seemed to be behaving exactly as expected (websocket connection closes with wrong cookie password), so no change seems necessary there.

I tested, and can confirm that execution works in IE9 (CSS is obviously wonky), as well as behind an Apache reverse proxy.

@Carreau
Member
Carreau commented Sep 4, 2012

@minrk, do you want this one merged, or was it just for experimenting?
If the former, could you rebase?

@minrk
Member
minrk commented Sep 4, 2012

I think we should do this, but it is certainly open to discussion, as it is a significant change.

@travisbot

This pull request passes (merged 1bcc1b6 into c678a87).

@travisbot

This pull request passes (merged e2f8f50 into c678a87).

@tkf
tkf commented Sep 4, 2012

Is IPython going to stop supporting websocket when this is merged? I know it sounds selfish, but let me say that I really hope you don't merge this if that is the case, because my Emacs client will not work.

@minrk
Member
minrk commented Sep 4, 2012

@tkf - SockJS still uses websockets when they are available, but your client will definitely need to be updated since there is an extra handshake, etc.

@minrk
Member
minrk commented Sep 4, 2012

@tkf - in fact, it looks to be just a trivial URL change.

@tkf
tkf commented Sep 4, 2012

Thanks for the link. I should have read the page more carefully. So I guess all I need to do is to connect to something like

ws://127.0.0.1:8888/kernels/UUID/shell/websocket

instead of

ws://127.0.0.1:8888/kernels/UUID/shell
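
For a non-browser client, the change described above amounts to appending `/websocket` to the channel URL. A minimal Python sketch of that rewrite follows; the helper name `raw_websocket_url` is hypothetical and not part of IPython or SockJS.

```python
from urllib.parse import urlsplit, urlunsplit


def raw_websocket_url(channel_url: str) -> str:
    """Return the SockJS raw-websocket URL for a plain channel URL.

    Hypothetical helper: it only appends the "/websocket" suffix discussed
    above and leaves scheme, host, query and fragment untouched.
    """
    parts = urlsplit(channel_url)
    path = parts.path.rstrip("/")
    # Do not append the suffix twice if the caller already did so.
    if not path.endswith("/websocket"):
        path += "/websocket"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(raw_websocket_url("ws://127.0.0.1:8888/kernels/UUID/shell"))
```
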
@tkf
tkf commented Sep 4, 2012

BTW, in the page you mentioned it says:

You can't open more than one SockJS connection to one domain at the same time due to the browsers limit of concurrent connections

Is this limitation fine for IPython? It needs two websocket connections (shell and iopub), right?

@minrk
Member
minrk commented Sep 4, 2012

Is this limitation fine for IPython? It needs two websocket connections (shell and iopub), right?

I saw that, but I don't fully understand it, as everything appears to work fine. It may warrant further investigation, as that limitation probably only comes up when websockets are unavailable (even though I have tested this as well, and it still works everywhere I have tried).

@minrk
Member
minrk commented Sep 4, 2012

Added merged subclass, so that it can be used with only one SockJS connection (separate websocket connections still work the same).

@travisbot

This pull request passes (merged bbfe2c7 into c678a87).

@ellisonbg
Member

So how was this working before when you tested it?

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@minrk
Member
minrk commented Sep 5, 2012

Before, it worked no differently to current master, just with SockJS handling fallback if websockets were unavailable. Now a single web socket connection will handle everything instead of two. I was able to artificially induce the failing case pointed out by @tkf with some custom config in Firefox (it does not seem possible with default config in FF or Chrome).

@ellisonbg
Member

I haven't had a chance to look at the code. Does this always use 1
websocket, or just some of the time? Do you think there are disadvantages
to using just one of them? We have other channels to hook up (like the
stdin) and I don't want to make our lives more difficult later.

@minrk
Member
minrk commented Sep 5, 2012

Kernel.js only uses one websocket connection. It's pretty trivial to switch on a key in the message, so I don't think there is any real difficulty extending multiple channels on a single connection. That said, I wouldn't do this if SockJS didn't require it, so if we stick to requiring pure websockets, a 1:1 channel system is cleaner.
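
To illustrate what "switching on a key in the message" could look like, here is a minimal Python sketch of multiplexing several channels over one connection. The envelope layout (`"channel"` and `"msg"` keys) and the `ChannelMux` class are assumptions made for illustration; they are not the format used in this PR.

```python
import json
from typing import Callable, Dict


class ChannelMux:
    """Route JSON messages arriving on one connection to per-channel handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def register(self, channel: str, handler: Callable[[dict], None]) -> None:
        # One handler per logical channel (e.g. "shell", "iopub", "stdin").
        self._handlers[channel] = handler

    @staticmethod
    def wrap(channel: str, msg: dict) -> str:
        # Sender side: tag the message with the channel it belongs to.
        return json.dumps({"channel": channel, "msg": msg})

    def dispatch(self, raw: str) -> None:
        # Receiver side: switch on the channel key and hand over the payload.
        envelope = json.loads(raw)
        channel = envelope["channel"]
        if channel not in self._handlers:
            raise KeyError("no handler for channel %r" % channel)
        self._handlers[channel](envelope["msg"])


shell_msgs, iopub_msgs = [], []
mux = ChannelMux()
mux.register("shell", shell_msgs.append)
mux.register("iopub", iopub_msgs.append)
mux.dispatch(ChannelMux.wrap("shell", {"msg_type": "execute_reply"}))
mux.dispatch(ChannelMux.wrap("iopub", {"msg_type": "stream"}))
```

Adding another channel such as stdin would, under these assumptions, only require one more `register` call.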

The clear advantage of this: it works behind proxies, etc. in the many environments where websockets don't. It's simply a matter of whether we want to wait for systems to support websockets or not, and how long we expect that to take. Making the move to sockjs is extremely easy, so we can do it at any later point if we want to give up on waiting, etc.

I didn't set out to make a PR, but just reading about how it might work resulted in fully working code, so I thought it might be worth it.

@fperez
Member
fperez commented Sep 6, 2012

On Wed, Sep 5, 2012 at 4:01 PM, Min RK wrote:

The clear advantage of this: it works behind proxies, etc. in the _many_ environments where websockets don't.

I think that's an advantage we shouldn't discount lightly: schools, an
environment where we really want ipython to work very well, often have
proxies. Since the added complexity doesn't seem that significant (a key
switch on messages, if I understand @minrk correctly), I'm slightly in
favor.

@ellisonbg
Member

I am not as worried about older browsers, but the proxy issue is huge.
That will also affect many companies (including MSFT). I will try to have
a look at the code, but I think we should probably go in this direction.

@minrk
Member
minrk commented Sep 7, 2012

One more thing I can change:

For backwards compatibility, the individual IOPub / Shell channels still exist, and are usable as pure websockets (not tested, but should be true according to sockjs spec). However, they, too, are SockJS, so the raw websocket URLs will be <kernel-id>/shell/websocket instead of <kernel-id>/shell. If I refactored it a little bit, I could make the original urls available as plain websocket handlers again, so only the merged single handler uses SockJS.

@ellisonbg
Member

I don't view the URL scheme of the notebook's web service as being
anywhere close to stable. I think it is definitely too early to worry about
backwards compatibility in this area, so let's just use the best design we
can. As we add multiuser capabilities and live-notebook sharing, all of
these URLs are going to change anyway. If I understand things correctly,
we should probably just move everything to a single websocket/sockjs
connection and get rid of the two socket URLs, etc.

@Sesshomurai

Hi, I'm interested in trying this, but am still a git noob. How do I merge this into ipython, and onto which version? Thank you for this work. We need it badly to keep ipython alive for us!

@Carreau
Member
Carreau commented Sep 15, 2012

Hi @Sesshomurai ,

Which version are you currently using ?
Merging is not always straightforward.

What you can do is try the state of this branch directly; for example, you can download it as a zip file autogenerated by GitHub. I got the link by going to Min's profile https://github.com/minrk/ipython, go to branches, then sockjs then zip link in the headers.

If you want to do it with git, add a remote

git remote add minrk https://github.com/minrk/ipython.git

fetch it

git fetch minrk

now you can try to merge minrk/sockjs

git merge minrk/sockjs

But please be sure to understand what you do, or ask for clarification.

You could also have a look at https://gist.github.com/3342247,
which allows you to refer to pull requests simply by their number. That would be

git merge origin/pr/2321

don't hesitate to ask if you need help, and also https://help.github.com/ is a good place to start learning git.

@Sesshomurai

Thanks so much for the response. I don't need to merge if it's not necessary, I just want a complete ipython with this capability embedded. So I will try the autogenerated zip and see.

@Sesshomurai

I also didn't understand this part "...then sockjs then zip link in the headers...." Do I need to layer in other code from elsewhere?

@Carreau
Member
Carreau commented Sep 15, 2012

Longer version would have been:

Go to the 'branch' tab. Select the 'sockjs' branch; the 'zip' button in
the GitHub header bar then allows you to download a snapshot of the sockjs
branch. This is where I got the link I embedded in my previous answer.

Does that make more sense?

@jasongrout
Member

I haven't noticed this issue before. Just FYI, we use SockJS in the Sage Cell server to communicate with an IPython kernel. I haven't had time to look at this implementation, but just in case it's useful, here are quick links to our code:

javascript:

https://github.com/sagemath/sagecell/blob/master/static/compute_server.js#L85

https://github.com/sagemath/sagecell/blob/master/static/compute_server.js#L989

server:
https://github.com/sagemath/sagecell/blob/master/handlers.py#L105

@Carreau
Member
Carreau commented Sep 29, 2012

Do we have any good external feedback that this PR improves things?
I would like to move forward and merge it, but as it changes the notebook URLs a little, I would prefer making an announcement on the mailing list before merging so that external clients, like the Emacs one, can be updated.

Thoughts ?

@Sesshomurai

Hi,
For us, it doesn't fix our problem. We have a rather strict network with firewalls, protocol rules, etc. It's not websocket friendly. In the SockJS baseline, it seems it can make a ws:// connection, but the protocol is interrupted somehow by our network, the ipython server throws error 500, and the ws protocol is broken. I can't say for sure how the network is interfering with ws:// connections though.

As was mentioned previously, the client code attempts to use ws:// first, then falls back, but in our unique situation it gets past the connection, which then closes afterwards, so no kernel executions occur, yet SockJS is still using the broken ws://.

It was mentioned that we might need to modify the client JS for SockJS to enforce a policy, but this is not ideal for us. Is there a server side fix that enforces the desire for sockjs to use a single port for its messaging without ever trying ws://? Or some elegant solution otherwise?

@Sesshomurai

I should add though, that we have no objection to merging it now since we have a rather unique situation and will continue to give feedback and try to resolve.

@Carreau
Member
Carreau commented Sep 29, 2012

Is there a server side fix

I think that in
IPython/frontend/html/notebook/notebookapp.py, line 153 and following,
you can change the router to

SockJSRouter(IOPubHandler, r"/kernels/%s/iopub" % _uuid_regex, dict(disabled_transports=['websocket']))

According to https://github.com/mrjoes/sockjs-tornado, this should disable the websocket transport.

It could be made configurable, but it is not right now.
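
As a rough idea of how this could be made configurable, the sketch below builds the settings dict from a boolean flag. The helper name `sockjs_user_settings` and the `allow_websocket` flag are hypothetical; only the `disabled_transports` key comes from the line above.

```python
def sockjs_user_settings(allow_websocket: bool = True) -> dict:
    """Build the settings dict passed as the router's third argument.

    Hypothetical helper: how the flag would be exposed in the notebook
    configuration is not decided in this thread.
    """
    settings = {}
    if not allow_websocket:
        # Same value as in the hard-coded example above.
        settings["disabled_transports"] = ["websocket"]
    return settings


print(sockjs_user_settings(allow_websocket=False))
```

The returned dict would take the place of the literal `dict(disabled_transports=['websocket'])` in the `SockJSRouter(...)` call.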

@tkf
tkf commented Sep 30, 2012

I am trying to connect IPython with SockJS using pure websocket in JS, but iopub channel does not respond.
tkf@9a570cd

  • When I execute some code in the notebook for the first time, the prompt stays at *.
  • When I execute more code in the notebook, the prompt is updated but there is no output.
  • Other messages in shell channel such as code completion work.

Do you have any idea why?

@Carreau
Member
Carreau commented Sep 30, 2012

I don't think it is possible to use websockets directly with SockJS.
I think it does its own kind of handshake to decide how to communicate.
You should use something like

this.shell_channel = new SockJS(ws_url + "/sock");

and it will decide which protocol to use.
If you want to specify that you only want websockets, there is a way to whitelist/blacklist protocols, but that would be in the SockJS documentation.

@tkf
tkf commented Sep 30, 2012

As @minrk said, it should be possible according to the README:

Although the main point of SockJS is to enable browser-to-server connectivity, it is possible to connect to SockJS from an external application. Any SockJS server complying with 0.3 protocol does support a raw WebSocket url. The raw WebSocket url for the test server looks like:

  • ws://localhost:8081/echo/websocket

You can connect any WebSocket RFC 6455 compliant WebSocket client to this url. This can be a command line client, external application, third party code or even a browser (though I don't know why you would want to do so).

from https://github.com/sockjs/sockjs-client#connecting-to-sockjs-without-the-client

@Carreau
Member
Carreau commented Sep 30, 2012

You can't open more than one SockJS connection to one domain at the same time due to the browsers limit of concurrent connections (this limit is not counting native websockets connections).

So could it be that you can't open both shell and iopub, and always have to multiplex the two, because in the end it is SockJS that handles the replies on the server?

@tkf
tkf commented Sep 30, 2012

I think this restriction is about SockJS when using non-websocket backend, not websocket itself.

The second-to-last commit (minrk@e2f8f50) by @minrk uses two SockJS connections and it works fine. As the documentation says

Under the hood SockJS tries to use native WebSockets first.

so I think using the raw websocket should work.

Also, this problem happens in Emacs which has no such restriction (I can open ten notebooks and kernels work fine).

@Carreau
Member
Carreau commented Sep 30, 2012

I have no idea.

Best thing would be to test on a minimal example with an echo server to try to figure out why.

@ellisonbg
Member

Another thing we should think about - web sockets are not cross domain restricted like http Ajax calls. I think we want cross domain calls to work.
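
Because the browser's same-origin policy does not block WebSocket handshakes, any cross-domain restriction has to be decided by the server, typically by inspecting the `Origin` header sent with the handshake. A minimal Python sketch of such a check follows; the function name, the allow-list parameter, and the choice to accept requests without an `Origin` header are all assumptions for illustration, not the notebook server's behavior.

```python
from urllib.parse import urlsplit


def origin_allowed(origin, host, extra_allowed=frozenset()):
    """Decide whether a WebSocket handshake from `origin` should be accepted.

    origin: value of the Origin header, or None if the client sent none.
    host: value of the Host header of the same request.
    extra_allowed: additional host[:port] values permitted cross-domain.
    """
    if not origin:
        # Assumption: non-browser clients (e.g. an Emacs client) usually
        # send no Origin header, so they are let through here.
        return True
    netloc = urlsplit(origin).netloc.lower()
    return netloc == host.lower() or netloc in extra_allowed


print(origin_allowed("http://domain2.org", "domain1.org:8888", {"domain2.org"}))
```

Under this sketch, supporting cross-domain kernels is a matter of what goes into the allow-list rather than something the transport decides.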

@Sesshomurai

But also not required. All the deployment scenarios we can think of will
want to maintain domain origination security between server and notebook
client.
Client should not use any promiscuous techniques like cross-domain.

@ellisonbg
Member

We do have some deployment cases where we would want to talk to kernels on
other domains.

@Carreau
Member
Carreau commented Oct 1, 2012

Web browsers never talk directly to the kernel, do they?

Moreover, SockJS does not always use websockets, so cross domain is not always possible.
I wouldn't rely on something that is not always possible.

@ellisonbg
Member

On Mon, Oct 1, 2012 at 12:27 PM, Bussonnier Matthias wrote:

Web browsers never talk directly to the kernel, do they?

No, they talk to the server through WS or SockJS which talks to the kernels
over ZeroMQ.

Moreover, SockJS does not always use websockets, so cross domain is not
always possible.
I wouldn't rely on something that is not always possible.

This is my point exactly. I see situations where we do want cross domain
browser/server/kernel communications. Because of this, I think we need to
rethink if we really want to go with SockJS.


@Carreau
Member
Carreau commented Oct 1, 2012

No, they talk to the server through WS or SockJS which talks to the kernels
over ZeroMQ.

[...]

This is my point exactly. I see situations where we do want cross domain
browser/server/kernel communications. Because of this, I think we need to
rethink if we really want to go with SockJS.

Do you mean that you want to have kernels that support authentication?
If this is the case, then we would have to split the current "server" into independent subparts.
The served html/js is really small. To dispatch users (for example), a 302 would be quite enough.
I don't see reasons not to access the kernel from the same domain from which you got the notebook.
I'm also sure nginx would support load balancing to multiple servers really well.

@jasongrout
Member

I'm not sure what the cross-origin discussion here is precisely, but I thought I'd throw out that we are using the kernel javascript to embed a number of single executable cells (each with a different kernel) into another webpage in the sage cell server project. Our setup looks like:

<client (may be embedded in a page from a different domain) > 
    |
  via SockJS multiplexed connection (all kernel channels are going through a single SockJS connection)
    | 
<tornado, which operates a number of SockJS/ZMQ bridges, possibly multiple for a single page>
    | 
  via the standard IPython ZMQ kernel channels
    |
< possibly a third domain actually running the kernels >
@ellisonbg
Member

On Mon, Oct 1, 2012 at 12:42 PM, Bussonnier Matthias <
notifications@github.com> wrote:

Do you mean that you want to have kernel that support authentications ?

Not only that. Here is more of what I mean. Let's say I am using a
notebook hosted on domain1.org. Maybe I have access to some crazy
multicore CPU at domain2.org, but I don't want to move my notebook document
over. It would be nice if you could simply point that notebook to a
private, authenticated WS URL on domain2.org to start and run your kernel
there. Maybe instead of domain2.org, you want to use localhost. I have
even had people say to me "give me kernel.js and a domain where I can
connect to to get kernels and I will write my own frontend."

The notebook document and kernel sides of the current notebook are mostly
independent and I want to keep this. Binding them to be same domain is too
restrictive in the long run if you ask me. But the proxy limitation of
WebSockets are also a problem, so I don't quite know what to do...

@Carreau
Member
Carreau commented Oct 1, 2012

Our setup looks like: [...]

That is what I would have done, but maybe with a layer on top to dispatch many users to several tornado servers.

The only thing I can see that would make Brian want cross domain is a central authentication point with dispatched kernels, where you might want to connect to the kernel in the next room while still logged into the main server, to avoid data going back and forth. But then I think dispatching a local slave server with remote authentication seems a better solution to me.

@Carreau
Member
Carreau commented Oct 1, 2012

Not only that. Here is more of what I mean. Let's say I am using a
notebook hosted on domain1.org. Maybe I have access to some crazy
multicore CPU at domain2.org, but I don't want to move my notebook document
over. It would be nice if you could simply point that notebook to a
private, authenticated WS URL on domain2.org to start and run your kernel
there.

You still need to run a websocket/zmq bridge on domain2.
The only thing you need is something like OAuth2 to allow domain2 to access the domain1 notebook.

"give me kernel.js and a domain where I can
connect to to get kernels and I will write my own frontend."

Writing your own frontend still needs a server. If domain2 has OAuth, it just has to forward some requests to domain2.

IMHO having kernel.js alone does not make sense, as you need lots of stuff like authentication, powering up and down...

Though splitting the notebook server into smaller components might be worth considering.

I'll still think about it.

@ellisonbg
Member

On Mon, Oct 1, 2012 at 1:14 PM, Bussonnier Matthias <
notifications@github.com> wrote:

You still need to run a websocket/zmq bridge on domain2.
The only thing you need a something like OAuth2 to allow domain2 to acces
domain1 notebook.

"give me kernel.js and a domain where I can
connect to to get kernels and I will write my own frontend."

Writing your frontend, still need a server. If domain2 have OAuth, it just
have to forward some request to domain2.

Yes, both domains would need to be running a server and we would have to
figure out how to handle the auth in a good way.

IMHO having kernel.js alone does not make sens, as you need lots of stuff

like authentication, powering up and down...

Though, splitting the notebook server in smaller component might be to
considere.

Yes, for this to work, the RESTful API for starting, stopping, inter, etc.
kernels would have to be moved over to WebSockets.

I'll still think about it.

Great!


@Sesshomurai

I think all the cross domain orchestration should be handled on the server, exposing only an HTTP REST interface to all clients. Then it's secure, web friendly, firewall friendly, and supports many clients easily.

@ellisonbg
Member

That would require a very centralized approach to all of this that is not
desirable. Authentication for each domain will probably need to be handled
separately (not everyone with accounts on domain1 will have accounts on
domain2) and we want a more decentralized/distributed architecture.
Think about git - you can pull changes from any git repo anywhere in the
world. Each of the git repos you interact with may be hosted in completely
different settings. But it all magically works together. It is secure,
web friendly, firewall friendly and supports many clients easily. That is
what I have in mind. I am not saying it will be easy, or that I know how
all of the details will work out, but I don't want to make decisions early
on that will outright prevent us from moving in this direction.

@Sesshomurai

I think it can be distributed and it's definitely desirable, but what is the use case for de-centralized, vis-a-vis notebooks?

In our idea for this, we already see that ipython/icontroller is a centralized controller for parallel processing. So keeping this notion and extending a web/notebook front end for interacting with that centralized controller to manipulate code/algorithms into a cluster is intuitive and doesn't require a new model.

However, as the notebook server/client is currently written, it is difficult to use through a firewall with common domain/port origination rules, etc. Although our environment is rather strict compared to others.

This is why we've suggested using the originating port and common protocols (HTTP/REST/ajax etc) that are more web-standard, cross-browser compatible, security friendly etc. In large enterprises (or even cross-enterprise), it will be more desirable to limit port use and cross-domain access. But in a smaller, more controllable environment, I can see interesting use cases. Hopefully, it's not a one-or-the-other issue. I don't think they compete; it's just that we suggest not coding in such a way that it prevents use in very secure environments.

On Mon, 2012-10-01 at 14:18 -0700, Brian E. Granger wrote:

That would require a very centralized approach to all of this that is not desirable. Authentication for each domain will probably need to be handled separately (not everyone with accounts on domain1 will have accounts on domain2) and we want a more decentralized/distributed architecture. Think about git - you can pull changes from any git repo anywhere in the world. Each of the git repos you interact with may be hosted in completely different settings. But it all magically works together. It is secure, web friendly, firewall friendly and supports many clients easily. That is what I have in mind. I am not saying it will be easy, or that I know how all of the details will work out, but I don't want to make decisions early on that will outright prevent us from moving in this direction.

@jasongrout
Member

On 10/1/12 3:05 PM, Bussonnier Matthias wrote:

Our setup looks like: [...]

That would be what I would have done, but maybe with a layer on top to
dispatch many users to several tornado servers.

The only thing I see that would make Brian want cross domain would be
a central authentication point with dispatched kernels: you might
want to be able to connect to the kernel in the next room while still
logged into the main server, to avoid data going back and forth. But then
I think dispatching a local slave server with remote authentication seems
a better solution to me.

In fact, we do run everything through HAProxy, and we've experimented
with load-balancing over several tornado servers (we don't have to deal
with authentication, but we do have to make sure that the load balancing
respects previous assignments). Right now, we have the tornado server
load balancing over several ssh accounts that actually manage the
kernels. In other words, we load balance on the second connection
above, but we've also experimented with load balancing on the first
connection above as well.
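
The requirement that load balancing "respects previous assignments" can be sketched roughly as follows. This is only an illustration, not the setup described above: the class name `StickyBalancer`, the backend names, and the round-robin choice for new sessions are all assumptions.

```python
import itertools


class StickyBalancer:
    """Round-robin balancer that remembers which backend a session was given."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._assigned = {}  # session id -> backend

    def backend_for(self, session_id):
        # Reuse the earlier assignment so a session's traffic stays on one backend.
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._cycle)
        return self._assigned[session_id]


# Hypothetical backend names, for illustration only.
balancer = StickyBalancer(["tornado-a", "tornado-b"])
first = balancer.backend_for("session-1")
second = balancer.backend_for("session-2")
again = balancer.backend_for("session-1")  # same backend as `first`
```

A real proxy would more likely key on a cookie or URL component and persist the table, but the invariant is the same: a known session never moves.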

@Carreau
Member
Carreau commented Oct 2, 2012

I think many people still hold (and I often do myself) the misconception that the browser is the frontend. It is not: the current server is the frontend, and the browser is at most an interface to view the frontend. One day you should be able to have multiple 'views' (which is collaboration mode), and you could then close the browser, open another, and everything should continue to work. The same holds if you are on a laptop and your network connection is cut. So the browser must not be a node in the network.

Having seen the comments above and done some research on the internet, I think that proxying requests to different machines depending on the URL is feasible. Our architecture already allows switching backends, it will allow switching frontends, and the way of spawning kernels should be configurable; the only thing to add is the ability to have those services on remote, potentially load-balanced, machines.

The way I see it, we need to split the server into a "nano-proxy", which redirects requests to the right machine, and services. This "nano-proxy" is the hub that "exposes" all the interfaces, and is connected to the list of notebooks, the machines able to spawn kernels...

Adding localhost to the available kernels shouldn't be harder than launching

ipython nanoproxy --kernels=localhost --notebooks=me@domain1.com --token af0ced346a

And go to 127.0.0.1:8888 instead of domain1.

Same for domain2, where either you give domain2 the token to read notebooks from domain1 and connect to domain2,
or give domain1 the kernels token for domain2 and still connect to domain1.

If this is well designed, you should not even be able to know whether domainX is a real endpoint or another nano-proxy.
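
To make the "redirect depending on the URL" idea concrete, here is a minimal sketch of the dispatch step only. Nothing here exists in IPython: the `ROUTES` table, the `pick_upstream` helper, the URL prefixes, and the ports are all hypothetical, chosen to mirror the `--kernels=localhost --notebooks=me@domain1.com` example above.

```python
import re

# Hypothetical routing table: URL pattern -> machine providing that service.
ROUTES = [
    (re.compile(r"^/kernels/"), "localhost:8889"),
    (re.compile(r"^/notebooks/"), "domain1.com:8888"),
]


def pick_upstream(path):
    """Return the host that should handle `path`, or None if nothing matches."""
    for pattern, upstream in ROUTES:
        if pattern.match(path):
            return upstream
    return None
```

Because the client only ever sees the proxy's address, an upstream entry could itself be another nano-proxy without the client noticing.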

@tkf
tkf commented Oct 2, 2012

Sorry to interrupt the discussion, but let me remind @minrk that I couldn't make SockJS's raw websocket work as advertised, as my comment is far above now:
#2321 (comment)

I guess checking this might help when supporting cross domain as @ellisonbg said
#2321 (comment)

@Sesshomurai

Great explanation and ideas.

@tkf
tkf commented Oct 3, 2012

I just checked that you can connect to raw websockets from JS. I ran the following code:

from tornado import web, ioloop
from sockjs.tornado import SockJSRouter, SockJSConnection


class EchoConnection(SockJSConnection):
    def on_message(self, msg):
        self.send(msg)

if __name__ == '__main__':
    EchoRouter1 = SockJSRouter(EchoConnection, '/echo1')
    EchoRouter2 = SockJSRouter(EchoConnection, '/echo2')

    app = web.Application(EchoRouter1.urls + EchoRouter2.urls)
    app.listen(9999)
    ioloop.IOLoop.instance().start()

ws1 = new WebSocket('ws://localhost:9999/echo1/websocket');
ws2 = new WebSocket('ws://localhost:9999/echo2/websocket');
ws1.onmessage = function(e){ console.log(e.data); };
ws2.onmessage = function(e){ console.log(e.data); };
ws1.send("hey1");
ws2.send("hey2");

then "hey1" and "hey2" appeared in browser console.

So, I think there is something wrong in IPython notebook server and/or my JS code.

@Carreau
Member
Carreau commented Oct 4, 2012

@tkf.
Thanks for trying this.
This was one of the things I wanted to do.

Your JS code looked fine to me. I'll try to investigate when I have more time.

@minrk
Member
minrk commented Oct 4, 2012

A tiny change to the notebook, which forces a raw WebSocket connection instead of SockJS, shows that it does indeed work as expected.

@ellisonbg
Member

OK this is good to know, thanks.

On Thu, Oct 4, 2012 at 1:47 PM, Min RK notifications@github.com wrote:

A tiny change (https://github.com/minrk/ipython/commit/3d16fa7a107962279fc525dd39b3e56fa083377c) to the notebook, which forces a raw WebSocket connection instead of SockJS,
shows that it does indeed work as expected.

@tkf
tkf commented Oct 5, 2012

@minrk your change uses the combined channel, not the original two channels. Does it mean when using SockJS we can't use two raw channels? But it is OK to use two raw channels in the simple echo server. This is strange.

@minrk
Member
minrk commented Oct 5, 2012

Sorry, @tkf, I misunderstood. I can also confirm that it works with kernel.js from master (separate channels), with the minor tweaks necessary to the urls (.../iopub/websocket instead of .../iopub, and turn the http:// back into ws://).
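
The URL tweak described here (append `/websocket`, switch the scheme back to `ws://`) can be written down as a small helper. This is a sketch in Python rather than the notebook's JavaScript; the function name `raw_websocket_url` is made up, and mapping `https://` to `wss://` is an assumption the comment above does not mention.

```python
def raw_websocket_url(sockjs_url):
    """Turn a SockJS endpoint URL into its raw WebSocket counterpart."""
    if sockjs_url.startswith("https://"):
        # Assumption: secure endpoints map to wss://.
        scheme, rest = "wss://", sockjs_url[len("https://"):]
    elif sockjs_url.startswith("http://"):
        scheme, rest = "ws://", sockjs_url[len("http://"):]
    else:
        raise ValueError("expected an http:// or https:// URL")
    # SockJS serves the raw WebSocket transport under <endpoint>/websocket.
    return scheme + rest.rstrip("/") + "/websocket"
```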

@tkf
tkf commented Oct 5, 2012

Could you upload your code? I couldn't make it work.

@minrk
Member
minrk commented Oct 5, 2012

minrk/ipython@2c59943

It's just the current sockjs branch, with kernel.js checked out from master, and a tiny fix to use the new URLs.

@tkf
tkf commented Oct 5, 2012

I can still reproduce the problem I described above
#2321 (comment)

This is how I started IPython:

./ipython.py profile create plain
./ipython.py notebook --port 9999 --profile plain --notebook-dir docs/examples/notebooks/ --log-level=DEBUG

I tried a different port number to work around the browser cache, but the result was the same.

Some other info:

% git log -n1
commit 2c599432c72bba020011790dfbc83a3521859d0b
Author: MinRK <...>
Date:   Fri Oct 5 13:04:30 2012 -0700

    update websocket URLs

% google-chrome --version
Google Chrome 21.0.1180.79

% lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 11.10
Release:        11.10
Codename:       oneiric

@minrk
Member
minrk commented Oct 5, 2012

Apparently the issue was that SockJS doesn't route empty messages, and if you don't have a cookie (i.e. no password), then the handshake message is empty. I switched the send_cookie to send ' ' if there is no cookie, and that seems to resolve the issue.
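
The fix amounts to never sending an empty first message. The actual change was made in the notebook's JavaScript `send_cookie`; the Python below is only a paraphrase of the idea, and both `handshake_payload` and the toy `route` function (which stands in for the observed "empty messages are not routed" behaviour) are hypothetical.

```python
def handshake_payload(cookie):
    """Return the first message to send; never empty, since empty ones were dropped."""
    return cookie if cookie else " "


def route(messages):
    """Toy stand-in for the observed behaviour: empty messages are not delivered."""
    return [m for m in messages if m != ""]
```

With no password set, `route([""])` delivers nothing and the server keeps waiting for a handshake, whereas `route([handshake_payload("")])` delivers a single-space message.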

@tkf can you confirm? (see my masterjs branch for the raw websockets instead of sockjs).

@tkf
tkf commented Oct 5, 2012

Yes, I can confirm that it works. Thank you very much!

@jenshnielsen jenshnielsen referenced this pull request in matplotlib/matplotlib Oct 23, 2012
Merged

WebAgg backend #1426

@Carreau
Member
Carreau commented Dec 4, 2012

This does not merge cleanly anymore...

As this is pretty big, I would suggest cleaning it up and merging, as it only improves things.
(when @minrk has time of course, good luck BTW)

@minrk
Member
minrk commented Jan 15, 2013

rebased.

One issue that we may want to address, alluded to by @jasongrout, is the restriction that SockJS requires only one active connection per client. This will limit the ability to connect to multiple kernels from one webpage. The restriction only actually applies when the websocket transport is unavailable.

@ellisonbg
Member

@minrk do you know if SockJS allows regular expression matching in the
handler URLs? This is something I think we will need.

@minrk
Member
minrk commented Jan 15, 2013

@ellisonbg I'm not sure what you mean. regex matching is already being used in the SockJS URLs.

@ellisonbg
Member

I looked at the code and I see that there is no native regular expression
handling, but you have added it in.

@minrk
Member
minrk commented Jan 15, 2013

Ah, I hadn't remembered doing that.

minrk added some commits Aug 21, 2012
@minrk minrk drop-in sockjs e8c5bba
@minrk minrk add sockjs-tornado 6d75182
@minrk minrk update sockjs.tornado imports
for IPython.external imports
ea2db2b
@minrk minrk use SockJS instead of pure websockets 8dd2a49
@minrk minrk fix missed heartbeat message for sockjs d03c51c
@minrk minrk fix new cookie name for sockjs 98f1bec
@minrk minrk Add joined IOPubAndShellHandler
Now only one SockJS connection is needed in the browser for both channels.
The previous single-channel connections remain available.
ac261a9
@minrk minrk cookie message must not be empty
SockJS seems to ignore empty messages
8b7bae9
@minrk minrk update sockjs-tornado
to hash ed3ce990a71e9dea5f82c153429b051186062431
62faaaa
@minrk minrk import ioloop from zmq, not tornado
in sockjs-tornado

tornado >= 2.5 makes a minor change in the IOLoop API,
so tornado.ioloop.PeriodicCallback >= 2.5 will not work with
IOLoop from pyzmq < 3.2.
c49bce1
@minrk
Member
minrk commented Jan 18, 2013

rebased

@minrk
Member
minrk commented Jan 19, 2013

I'm going to see if I can integrate the Sage MultiSockJS code, so that this will work with multiple kernels on a single page. I don't like how it changes the wire format from {JSON} to kernel_id/channel,{JSON}, but I don't really see a good way around that. @ellisonbg what do you think about that?
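
For readers unfamiliar with the proposed format, here is a rough sketch of what `kernel_id/channel,{JSON}` framing would look like. This is not the Sage MultiSockJS code; `encode` and `decode` are hypothetical helpers, and they assume the kernel id contains neither `/` nor `,`.

```python
import json


def encode(kernel_id, channel, msg):
    """Prefix a JSON message with its kernel id and channel name."""
    return "%s/%s,%s" % (kernel_id, channel, json.dumps(msg))


def decode(frame):
    """Split a multiplexed frame back into (kernel_id, channel, message)."""
    # Split at the first comma only: the JSON body may contain commas itself.
    prefix, body = frame.split(",", 1)
    kernel_id, channel = prefix.split("/", 1)
    return kernel_id, channel, json.loads(body)
```

The routing information that used to live in the URL now travels in every frame, which is exactly the part of the design being questioned here.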

@ellisonbg
Member

I really don't like the idea of taking things that belong in kernel related URLs (kernel_id, username, etc) and putting them into the message format. It breaks the main abstraction of the web, which is that resources are addressable using URLs. Some problems with this:

  1. As we have more complex URLs, we have to add those things too to the message spec. If we have usernames in the URLs, those have to be moved to the message spec. These things just don't belong there.
  2. It makes these channels much more vulnerable to certain kinds of attack. If someone tries to get a WebSocket connection to /kernels/kid/iopub for a kid that doesn't exist, the server returns a simple 404 - the protocol upgrade to WebSocket never occurs and the client doesn't have a chance to start sending WebSocket messages. This means that you have to know what a valid kid is to even get a WebSocket connection. This is not security, but it minimizes the attack surface. If/when we add usernames to our URLs /ellisonbg/kernels/kid/iopub, that attack surface decreases even more - someone has to know my username and my particular kid to get a WebSocket connection.

This also relates to the inability to do native URL regular expression in SockJS - you have to make the full WebSocket connection and then do your own URL parsing.
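
The URL-based check described in point 2 can be sketched as below. It is a simplified model, not the notebook server's handler code: `KERNEL_URL`, `check_upgrade`, and the channel names in the pattern are assumptions, and a real tornado handler would raise an HTTP error instead of returning a status code.

```python
import re

# Hypothetical URL scheme modelled on /kernels/<kid>/iopub from the comment above.
KERNEL_URL = re.compile(r"^/kernels/(?P<kid>[^/]+)/(?P<channel>iopub|shell)$")


def check_upgrade(path, known_kernels):
    """Return the HTTP status to answer a WebSocket upgrade request with.

    101 means the upgrade may proceed; 404 rejects the request before any
    WebSocket message can be sent.
    """
    match = KERNEL_URL.match(path)
    if match is None or match.group("kid") not in known_kernels:
        return 404
    return 101
```

With everything multiplexed over a single connection, this check can only happen after the connection is already established, on a per-message basis.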

I really want to like SockJS, but I am not excited about the changes it forces us to make. But maybe we should IRC about this, as it is pretty important.

@minrk
Member
minrk commented Jan 19, 2013

Based on IRC discussion, I won't do the MultiSockJS work. As proxy support for WebSockets improves (HA Proxy and node-http-proxy so far, nginx sort of / soon), the case for SockJS diminishes.

I will leave this here for now, but at current pace, I expect WebSocket support to improve quickly enough that we won't need to make the compromises that SockJS would force upon us.

This was just an experiment to see how easy integrating SockJS would be, anyhow.

@jasongrout
Member

Just curious: is the IRC discussion archived anywhere?

I too see SockJS as a temporary solution (in fact, the premise behind SockJS is that it is a temporary solution until websocket support matures). But it is an important hole for us to fill, as we are supporting legacy clients.

@ellisonbg
Member

We are less worried about legacy clients - by now all browsers have very good WebSocket support. You could argue that some people still use older browsers, but it is safe to say that those users are not our target market. The main area where we have run into problems with WebSockets is in the proxy support. But this is improving quickly, and there are already some really good solutions. The only issue right now is that it forces people to choose a supported proxy, rather than letting them choose their own.

@jasongrout
Member

On our end, we should record how many times sockjs uses a websocket connection vs. some other method.

@minrk
Member
minrk commented Jan 19, 2013

The gist of the IRC discussion was:

drawbacks of SockJS:

  • we can't use url-based scheme, because we have to serve everything via one SockJS connection
  • we have to cram things into the message spec that we don't think belong there (kernel, channel)
  • some security/performance issues associated with cramming all channels in one connection.

drawbacks of WebSockets:

  • don't work on old browsers (don't much care - even current IE works)
  • don't work in various proxy environments

The drawbacks of SockJS aren't going to change, and the drawbacks of WebSockets are only getting smaller. The decision to go with SockJS is really one of: do we need to work in a proxy environment before the necessary proxy environment adds support for WebSockets. Six months ago, I thought the answer was probably yes. Now, I think it's probably no.

@ellisonbg
Member

Should we close this and open an issue to track it? That will allow us to keep our open PRs on work that is actively moving forward.

@minrk minrk referenced this pull request Jan 21, 2013
Closed

Maybe use SockJS #2822

@minrk
Member
minrk commented Jan 21, 2013

Sure. Closing here and opening as #2822

@minrk minrk closed this Jan 21, 2013