shared ssh connections linger on MacOS #26

Closed
andre-merzky opened this issue Mar 1, 2013 · 2 comments

Comments

@andre-merzky
Member

Lingering shared ssh connections lead to::

"Shared connection to india.futuregrid.org closed."

exceptions if the connection limit (10 on old Macs) is saturated. In general, it would be good to be able to have multiple master connections per host...
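One way to picture multiple master connections per host is to give each master its own control socket. This is only a rough sketch using OpenSSH's `-M`, `-N`, `ControlMaster` and `ControlPath` options; the function names, the `slot` numbering and the `control_dir` location are assumptions made for illustration, not anything from the saga code:

```python
import os


def control_path(host, slot, control_dir="~/.ssh/ctl"):
    """Return the control socket path for master number `slot` (hypothetical layout)."""
    return os.path.join(os.path.expanduser(control_dir), f"{host}-{slot}.sock")


def master_command(host, slot, control_dir="~/.ssh/ctl"):
    """Build an ssh argv that starts an independent master for `host`."""
    # -M puts ssh into master mode, -N runs no remote command.
    # A distinct ControlPath per slot gives a separate master connection.
    path = control_path(host, slot, control_dir)
    return ["ssh", "-M", "-N", "-o", f"ControlPath={path}", host]


def client_command(host, slot, remote_cmd, control_dir="~/.ssh/ctl"):
    """Build an ssh argv that reuses master number `slot` for one command."""
    path = control_path(host, slot, control_dir)
    return ["ssh", "-o", "ControlMaster=no", "-o", f"ControlPath={path}",
            host, remote_cmd]


def pick_slot(counter, n_masters):
    """Spread sessions over the masters round-robin."""
    return counter % n_masters
```

With, say, two masters per host, sessions alternate between the two sockets, so each master carries half of the shared sessions.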

@ghost ghost assigned andre-merzky Mar 1, 2013
@andre-merzky
Member Author

This seems to be a garbage collection issue. To keep resource utilization low, I already implemented a timeout garbage collector (saga/utils/timeout_gc.py), which will tear down idle ssh connections (timeout defaults to 60 seconds). That is supposed to prevent lingering ssh connections on the target host. The connection will be automatically re-established on the next activity.
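To illustrate the idea (this is not the code in saga/utils/timeout_gc.py, just a minimal sketch of the same pattern): a wrapper remembers when the connection was last used, closes it once it has been idle longer than the timeout, and reconnects lazily on the next use. The class name, the `connect` callable and the injectable `clock` are all hypothetical:

```python
import threading
import time


class IdleConnection:
    """Hypothetical wrapper: opens lazily, closes after an idle timeout."""

    def __init__(self, connect, timeout=60.0, clock=time.monotonic):
        self._connect = connect    # callable returning an object with .close()
        self._timeout = timeout    # idle seconds before teardown
        self._clock = clock        # injectable so the timeout can be tested
        self._lock = threading.Lock()
        self._conn = None
        self._last_used = clock()

    def get(self):
        """Return a live connection, re-establishing it if it was torn down."""
        with self._lock:
            if self._conn is None:
                self._conn = self._connect()
            self._last_used = self._clock()
            return self._conn

    def collect(self):
        """Close the connection if it has been idle for too long.

        Returns True if a connection was closed by this call.
        """
        with self._lock:
            idle = self._clock() - self._last_used
            if self._conn is not None and idle >= self._timeout:
                self._conn.close()
                self._conn = None
                return True
            return False
```

In such a setup, `collect()` would be called periodically, for example from a background thread, while all regular activity goes through `get()`.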

When running the test suite, we open ssh connections to the target host repeatedly. It seems to me that Python's garbage collection is not calling the job.Job and job.Service destructors in time, and those are what should kill the connections for good. Thus, the number of connections adds up, and eventually you run out of resources (PTYs and connections, it seems). As the resource limits are lower on MacOS (10 connections vs. 20 on Linux, by default), you see connection failures after a couple of tests.

There are two options: manually delete the job.Job and job.Service instances after each test to get rid of the old sessions, or slow down the tests. The second sounds, uhm, ugly, so let's try to add active closing of the connection for now.
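A minimal sketch of what active closing could look like in a test, assuming the service object grows an explicit `close()` method. The `Service` class below is a hypothetical stand-in with a counter instead of a real ssh connection, not the actual job.Service:

```python
import contextlib


class Service:
    """Hypothetical stand-in for an object that owns an ssh connection."""

    open_connections = 0   # class-wide counter, for illustration only

    def __init__(self, url):
        self.url = url
        self._closed = False
        Service.open_connections += 1

    def close(self):
        """Release the connection now; safe to call more than once."""
        if not self._closed:
            self._closed = True
            Service.open_connections -= 1

    def __del__(self):
        # Fallback only: the destructor may run late, which is the problem here.
        self.close()


def run_test(url):
    """Open a service, use it, and close it before returning."""
    # contextlib.closing calls close() on exit, even if the body raises.
    with contextlib.closing(Service(url)) as service:
        return service.url, Service.open_connections
```

The point of the design is that the connection count no longer depends on when the garbage collector gets around to the destructor; each test releases its connection before the next one starts.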

Well, the third option of course is that I screwed up the connection-tear-down on MacOS, which would not be surprising, ey? ;-)

I'll be closing the other two tickets on the pty problem, as I think those are duplicates of this one. I improved the error messages for the respective failures.

@oleweidner
Contributor

fixed in devel
