container exec leaks file handles #209

Closed
CuriousBadger277 opened this Issue Dec 9, 2016 · 2 comments


Hi,

First of all, thanks for the great library implementation!

I'm having issues with container exec. I want to use it to manage a moderately large fleet of containers, and found that after running exec enough times the process runs out of file descriptors (FDs).

I'm using the following versions (via pip) which I believe to be up to date:

 ws4py==0.3.4
 requests==2.11.1
 requests-ntlm==0.3.0
 requests-unixsocket==0.1.5
 pylxd==2.2.1

To reproduce, run this script in terminal 1 (it may need minor fixups):

#!/usr/bin/env python
import pylxd
c = pylxd.Client()
o = c.containers.get('my-container')
while True:
   o.execute(['/bin/ls']) # or some other innocuous command

If you leave this running long enough, you'll hit the per-process limit on file descriptors. Alternatively, you can open a second terminal and watch the fd list grow under /proc:

$ ps -e | grep python # Make a note of the PID
$ cd /proc/INSERT_PID_HERE/fd
$ watch -n1 sh -c 'ls -l | wc -l'
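
The same count can also be taken from inside the Python process. This is only a rough sketch, assuming Linux (it reads /proc/self/fd); the helper names count_open_fds and fd_growth are made up for illustration and are not part of pylxd:

```python
import os


def count_open_fds():
    """Return the number of file descriptors open in this process (Linux only)."""
    # /proc/self/fd has one entry per open descriptor. Listing the directory
    # briefly opens one extra descriptor, so compare counts rather than
    # trusting the absolute value.
    return len(os.listdir('/proc/self/fd'))


def fd_growth(action, iterations=10):
    """Call action() repeatedly and return how many descriptors were left open."""
    before = count_open_fds()
    for _ in range(iterations):
        action()
    return count_open_fds() - before
```

With the container object o from the script above, fd_growth(lambda: o.execute(['/bin/ls'])) should return a number well above zero if the leak is present.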

I've tracked this down to what I believe are the websockets in the execute function:

    # container.py:226
    manager = WebSocketManager()

    stdin = _StdinWebsocket(self.client.websocket_url)
    stdin.resource = '{}?secret={}'.format(parsed.path, fds['0'])
    stdin.connect()
    stdout = _CommandWebsocketClient(manager, self.client.websocket_url)
    stdout.resource = '{}?secret={}'.format(parsed.path, fds['1'])
    stdout.connect()
    stderr = _CommandWebsocketClient(manager, self.client.websocket_url)
    stderr.resource = '{}?secret={}'.format(parsed.path, fds['2'])
    stderr.connect()

    manager.start()

    while len(manager.websockets.values()) > 0:
        time.sleep(.1)

    operation = self.client.operations.get(operation_id)
    return _ContainerExecuteResult(
        operation.metadata['return'], stdout.data, stderr.data)

At this point I gave up due to time constraints, but my best guess is that the manager runs its own thread and is never shut down, so it needs to be released manually using manager.close() ([http://ws4py.readthedocs.io/en/latest/sources/ws4py/#ws4py.manager.WebSocketManager]).

Hi,

I noticed that it leaks not only file handles but also threads, which can cause your program to reach the maximum number of allowed threads, e.g. if it calls container.execute() a lot.

A fix seems to be to do as the ws4py docs indicate (http://ws4py.readthedocs.io/en/latest/sources/managertutorial/) and add the following after the while loop:

manager.close_all()
manager.stop()
manager.join()
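
To show why the stop/join part matters, here is a minimal sketch using only the standard library. StandInManager is a hypothetical stand-in for any manager object that runs its own thread; it is not ws4py's WebSocketManager, and the helper names are made up for illustration:

```python
import threading


class StandInManager(threading.Thread):
    """Hypothetical stand-in for a manager that runs its own background thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self._stopped = threading.Event()

    def run(self):
        # Idle until stop() is called, like a polling loop with nothing to do.
        self._stopped.wait()

    def stop(self):
        self._stopped.set()


def run_once(cleanup):
    manager = StandInManager()
    manager.start()
    try:
        pass  # the exec work would happen here
    finally:
        if cleanup:
            # Shut the thread down even if the work above raised.
            manager.stop()
            manager.join()


def leaked_threads(cleanup, iterations=5):
    """Return how many threads are still alive after the given number of runs."""
    before = threading.active_count()
    for _ in range(iterations):
        run_once(cleanup)
    return threading.active_count() - before
```

Without the cleanup, every call leaves one idle thread behind; with it, the thread count returns to where it started. Putting the cleanup in a finally block is my own choice here, so that it also runs when the work raises.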

@rockstar rockstar closed this Jan 26, 2017

Link to the matching pull request, for reference: #211.
