Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in utils/factory:ssh_connect #135

Closed
yottatsa opened this issue Jun 25, 2014 · 12 comments
Closed

Race condition in utils/factory:ssh_connect #135

yottatsa opened this issue Jun 25, 2014 · 12 comments

Comments

@yottatsa
Copy link

I've found out that simultaneous rpyc.classic.ssh_connect calls from different instances could possibly lead to race condition in https://github.com/tomerfiliba/rpyc/blob/master/rpyc/utils/factory.py#L159 .

Due to non-atomic operations of port discovery and port bind it could be possible to connect to another tunnel. ssh only warns you about port has been already used:

bind: Address already in use
channel_setup_fwd_listener: cannot listen to port: 2222

Since ssh does not accept 0 as a bind port, I have no idea how to fix it easy.

@tomerfiliba
Copy link
Collaborator

hrrrm, yes, that is a race. perhaps i should add a small (10ms) sleep.


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 12:53 PM, Vladimir Eremin notifications@github.com
wrote:

I've found out that simultaneous rpyc.classic.ssh_connect calls from
different instances could possibly lead to race condition in
https://github.com/tomerfiliba/rpyc/blob/master/rpyc/utils/factory.py#L159
.

Due to non-atomic operations of port discovery and port bind it could be
possible to connect to another tunnel. ssh only warns you about port has
been already used:

bind: Address already in use
channel_setup_fwd_listener: cannot listen to port: 2222

Since ssh does not accept 0 as a bind port, I have no idea how to fix it
easy.


Reply to this email directly or view it on GitHub
#135.

@yottatsa
Copy link
Author

It will be nice if you:

  • mark this call unsafe and prompt users to use locks, and
  • add error handling on situation I've shown on initial message

@tomerfiliba
Copy link
Collaborator

oh, you mean the race is due to multiple threads? than that's easy to solve.
i thought you meant the race was due to external processes opening sockets
and getting the same port number

-tomer


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 1:26 PM, Vladimir Eremin notifications@github.com
wrote:

It will be nice if you:

  • mark this call unsafe and prompt users to use locks, and
  • add error handling on situation I've shown on initial message


Reply to this email directly or view it on GitHub
#135 (comment).

@yottatsa
Copy link
Author

Not multiple threads, but multiple processes.

Your fix covers 50% cases. But for multiple independent processes it is not working and user should synchronise rpyc.classic.ssh_connect by themselves, with their business logic. So it will be nice to write some about this in documentation.

@tomerfiliba
Copy link
Collaborator

but there's nothing i could do there -- it's not like they can synchronize
with a global semaphore or something, because any process calling
bind() or connect (without binding first) could grab that port number. it's
an inherent race condition in BSD sockets... there's no way for me to
"reserve" a port for someone else, and sshd won't tell me which port it had
chosen.

-tomer


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 2:25 PM, Vladimir Eremin notifications@github.com
wrote:

Not multiple threads, but multiple processes.

Your fix cover 50% cases. But for multiple independent processes it is not
working and user should synchronise rpyc.classic.ssh_connect by
themselves, with their business logic. So it will be nice to write some
about this in documentation.


Reply to this email directly or view it on GitHub
#135 (comment).

@yottatsa
Copy link
Author

So I offer you to just say it in documentations, e.g.:

rpyc.classic.ssh_connect is unsafe. Use synchronisation there.

@tomerfiliba
Copy link
Collaborator

but the thing is, you cannot synchronize it. any process could get in the
way. there's no safe way to do it, without altering the kernel.


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 3:24 PM, Vladimir Eremin notifications@github.com
wrote:

So I offer you to just say it in documentations, e.g.:

rpyc.classic.ssh_connect is unsafe. Use synchronisation there.


Reply to this email directly or view it on GitHub
#135 (comment).

tomerfiliba added a commit that referenced this issue Jun 25, 2014
@yottatsa
Copy link
Author

Why? I can synchronise it in my app like that (or any other way using file-based inter-process synchronisation).

import filelock.FileLock
...
with filelock.FileLock("/var/lock/mymanagementapp/ssh.lock"):
    client = rpyc.classic.ssh_connect(...)

@tomerfiliba
Copy link
Collaborator

look at this code:

import socket
s = socket.socket()
s.connect(("www.google.com", 80))

since i called connect without bind first, the socket layer does it for
us an chooses a free local port.

rpyc chooses a random port, no matter how, and passes it to ssh. in this
time frame, from rpyc choosing a port and ssh binding it, any process may
bind to it and there's nothing you can do about it. it's how the socket
layer works: a port is either bound or free, so rpyc can't "reserve" a port
for ssh.

i hope i'm getting the message clear this time: any process might grab
this port, no matter what synchronization mechanisms you use. the fact that
you sync your own processes with a lock file is nice, but you can't protect
it from other processes that are not aware of that lock file.

-tomer


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 3:57 PM, Vladimir Eremin notifications@github.com
wrote:

Why? I can synchronise it in my app like that (or any other way using
file-based inter-process synchronisation).

import filelock.FileLock
...
with filelock.FileLock("/var/lock/mymanagementapp/ssh.lock"):
client = rpyc.classic.ssh_connect(...)


Reply to this email directly or view it on GitHub
#135 (comment).

@yottatsa
Copy link
Author

Actually, yes. But you’ve got an exception when you’ll try to connect to some service except rpyc.

I’ve got a case when I connect to wrong server via rpyc. Synchronisation in business logic + exception handling solve 90% of problems.

BTW, the safest mechanism is using of native protocol.

On 25 Jun 2014, at 17:06, Tomer Filiba notifications@github.com wrote:

look at this code:

import socket
s = socket.socket()
s.connect(("www.google.com", 80))

since i called connect without bind first, the socket layer does it for
us an chooses a free local port.

rpyc chooses a random port, no matter how, and passes it to ssh. in this
time frame, from rpyc choosing a port and ssh binding it, any process may
bind to it and there's nothing you can do about it. it's how the socket
layer works: a port is either bound or free, so rpyc can't "reserve" a port
for ssh.

i hope i'm getting the message clear this time: any process might grab
this port, no matter what synchronization mechanisms you use. the fact that
you sync your own processes with a lock file is nice, but you can't protect
it from other processes that are not aware of that lock file.

-tomer


Tomer Filiba
tomerfiliba.com http://www.facebook.com/tomerfiliba
http://il.linkedin.com/in/tomerfiliba

On Wed, Jun 25, 2014 at 3:57 PM, Vladimir Eremin notifications@github.com
wrote:

Why? I can synchronise it in my app like that (or any other way using
file-based inter-process synchronisation).

import filelock.FileLock
...
with filelock.FileLock("/var/lock/mymanagementapp/ssh.lock"):
client = rpyc.classic.ssh_connect(...)


Reply to this email directly or view it on GitHub
#135 (comment).


Reply to this email directly or view it on GitHub.

@orivej
Copy link

orivej commented Jun 3, 2015

Starting ssh with -oExitOnForwardFailure=yes sidestaps the race condition because then ssh exits if the port is already bound. After that its exit can be noted (altough I'm not sure on the best way to differentiate this exit reason from other ones) and port acquisition restarted.

@yottatsa
Copy link
Author

yottatsa commented Jun 3, 2015

Perfect!

On 03 Jun 2015, at 03:56, Orivej Desh notifications@github.com wrote:

Starting ssh with -oExitOnForwardFailure=yes sidestaps the race condition because then ssh exits if the port is already bound. After that its exit can be noted (altough I'm not sure on the best way to differentiate this exit reason from other ones) and port acquisition restarted.


Reply to this email directly or view it on GitHub #135 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants