0.9.12: Possible leak of open files #1835

Closed
hroncok opened this issue Jul 7, 2021 · 3 comments · Fixed by #1840

Comments


hroncok commented Jul 7, 2021

In Fedora, when I build the proposed 0.9.12 version of Pythran, we run the tests with pytest-xdist. When the number of test runner processes (the -n argument of pytest-xdist) is low (1, 2 or 3), almost all the tests eventually fail with OSError: [Errno 24] Too many open files.

One failure typically looks like this:

______________ TestTyping.test_typing_aliasing_and_combiner_back _______________
[gw0] linux -- Python 3.10.0 /usr/bin/python3
self = <pythran.tests.test_typing.TestTyping testMethod=test_typing_aliasing_and_combiner_back>
    def test_typing_aliasing_and_combiner_back(self):
>       self.run_test('def typing_aliasing_and_combiner_back(i): d=set();e=set(); f = e or d; e.add(i); return d,e,f', 116, typing_aliasing_and_combiner_back=[int])
pythran/tests/test_typing.py:96: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pythran/tests/__init__.py:309: in run_test
    cxx_compiled = compile_pythrancode(
pythran/toolchain.py:414: in compile_pythrancode
    output_file = compile_cxxcode(module_name,
pythran/toolchain.py:353: in compile_cxxcode
    fdpath = _write_temp(cxxcode, '.cpp')
pythran/toolchain.py:76: in _write_temp
    with NamedTemporaryFile(mode='w', suffix=suffix, delete=False) as out:
/usr/lib64/python3.10/tempfile.py:549: in NamedTemporaryFile
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dir = '/tmp', pre = 'tmp', suf = '.cpp', flags = 131266
output_type = <class 'str'>
>   ???
E   OSError: [Errno 24] Too many open files: '/tmp/tmpzpm4n2ix.cpp'
/usr/lib64/python3.10/tempfile.py:252: OSError

Typical progress output looks like this (for 1 test runner process):

....................................s................................... [  2%]
...............................................s........................ [  4%]
........................................................................ [  6%]
........................................................................ [  8%]
.............................................s.....................s.... [ 10%]
...........................s...s........................................ [ 12%]
.......................s................................................ [ 14%]
..................................................................s.s..s [ 16%]
....ss.......s.s...ss.s...s............................................. [ 18%]
........................................................................ [ 20%]
.....................................s...s.............................. [ 22%]
........................................................................ [ 24%]
........................................................................ [ 27%]
..............ss......sss.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 29%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 31%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFF.FFFFFFFFFFFFFFFFFF [ 33%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 35%]
FFFFFFFFFFFFFFFFFsFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsFFFFFFFFFFFFFFFF [ 37%]
FssFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 39%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 41%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 43%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsssFFFFFFFFF [ 45%]
FFssFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 47%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 49%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 51%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 54%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 56%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFsFFFF [ 58%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 60%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 62%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 64%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 66%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 68%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 70%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 72%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 74%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 76%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 79%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 81%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 83%]
FFFFFFFFFFFFFF.FF.......FFFFFF....F........F.FFFFFFFF.FF.FFFF.FFFFFFFFFF [ 85%]
FFFFFFFFsFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE [ 87%]
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE [ 89%]
EFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 91%]
FFFFFFFFFFFFFFFFFFFFFFFFFFF...........F.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 93%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFss.........FFF..FFFFFFFFFFsFFFFF. [ 95%]
FF........ssFFFFFFFFFFFFF.FEEEEEEEEEEEE..FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 97%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 99%]
FFFFFFF                                                                  [100%]

For more processes, the failures appear later, e.g. this is for 3 processes:

.................................s...................................... [  2%]
................................................s....................... [  4%]
........................................................................ [  6%]
........................................................................ [  8%]
.................................s...................................s.. [ 10%]
......s...................s............................................. [ 12%]
...........................................s..........................s. [ 14%]
...................................s....s....s.....s.s.................. [ 16%]
.............................................................s..s....... [ 18%]
..s.................s...........s........s.............................. [ 20%]
........................................................................ [ 22%]
................................................s....................... [ 24%]
........................................................................ [ 27%]
........................................................................ [ 29%]
.......................ss.....................sss....................... [ 31%]
........................................................................ [ 33%]
........................................................................ [ 35%]
........................................................................ [ 37%]
.....s.................................sss........................ss.... [ 39%]
........................................................................ [ 41%]
........................................................................ [ 43%]
........................................................................ [ 45%]
...........................s............................................ [ 47%]
........................................................................ [ 49%]
...............................ss....................................... [ 51%]
...................................................s.................... [ 54%]
........................................................................ [ 56%]
........................................................................ [ 58%]
........................................................................ [ 60%]
........................................................................ [ 62%]
........................................................................ [ 64%]
........................................................................ [ 66%]
......................................FFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 68%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 70%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 72%]
FFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFF.FFFF [ 74%]
FFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFF.FFFFFFFFFFF [ 76%]
FFFF.FF........FFFFFF....F........F..FFFFFFFF.FF.FFF.F.FFFFFFFFFFFFFFFFF [ 79%]
FsFFF.FFFFFF.FFFFFFFFFFFFFFFFFFFFEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE [ 81%]
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEFFFF [ 83%]
FFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFF [ 85%]
FFFFFFFFFFFFFF..FFFFFFFFFF...........F.FFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFF [ 87%]
FFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFF.FFFFFFss.........FFF..FFFFFFFFFFsFFFF [ 89%]
F..FF.........ssFFFFFFFFFFFFF.FEEEEEEEEEEEE..FFFFFFFFFFFFFFFFFFFFFFFFFFF [ 91%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 93%]
FFFFFFFFFFFF............................................................ [ 95%]
........................................................................ [ 97%]
........................................................................ [ 99%]
.......                                                                  [100%]

When the number of test processes is higher (e.g. 6 or 8), the problem is not observable. Likely, this means that the number of open files rises when one Python process calls pythran (or some parts of the test) repeatedly.

The limit for open files is set to 1024. Setting it to 2048 (via ulimit -n 2048) makes the errors go away and the tests pass.
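
To check whether a single process really accumulates descriptors, the count can be compared with the limit from inside Python. This is a minimal sketch, not part of the Pythran test suite: the helper names `open_fd_count` and `fd_soft_limit` are hypothetical, and listing `/proc/self/fd` assumes Linux.

```python
import os
import resource


def open_fd_count():
    """Count file descriptors currently open in this process (Linux only)."""
    return len(os.listdir('/proc/self/fd'))


def fd_soft_limit():
    """Return the soft limit on open files, the value `ulimit -n` reports."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


before = open_fd_count()
# Deliberately keep five files open to simulate a leak.
leaked = [open(os.devnull) for _ in range(5)]
after = open_fd_count()
print(f"open fds: {before} -> {after}, soft limit: {fd_soft_limit()}")

# Closing the files brings the count back down.
for f in leaked:
    f.close()
after_close = open_fd_count()
```

Calling such a helper before and after each test would show which tests leave descriptors behind.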


hroncok commented Jul 7, 2021

I've also triggered a test build on Fedora 34 with Python 3.9 to see if this is Python 3.10 related. EDIT: That requires updated gast and beniget, so no dice for now. I can try later in Copr if the cause is not what it seems to be.


In the meantime, it seems that the traceback has a reference to a possible source of the leak: NamedTemporaryFile(..., delete=False). EDIT: It's just a line that tries to open another file when the limit is already exceeded, usually a "random" one.


hroncok commented Jul 7, 2021

def compile_cxxcode(module_name, cxxcode, output_binary=None, keep_temp=False,
                    **kwargs):
    '''c++ code (string) -> temporary file -> native module.
    Returns the generated .so.
    '''
    # Get a temporary C++ file to compile
    fdpath = _write_temp(cxxcode, '.cpp')
    output_binary = compile_cxxfile(module_name, fdpath,
                                    output_binary, **kwargs)
    if not keep_temp:
        # remove tempfile
        os.remove(fdpath)
    else:
        logger.warning("Keeping temporary generated file:" + fdpath)
    return output_binary

What happens here if compile_cxxfile raises an exception? The file is never deleted, regardless of the keep_temp value. In that case, if the tests exercise thousands of combinations of expected failures, the open file limit seems to be easily reached. I'll see what happens if the os.remove() call is moved to finally:.
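
Moving the removal into finally: could look roughly like the sketch below. It is an illustration under assumptions, not the actual Pythran change: `compile_with_cleanup` and `failing_compile` are hypothetical names, and the body of `_write_temp` is guessed from the single line visible in the traceback. Note that this only removes the file from disk; it does not by itself close any file descriptor.

```python
import os
from tempfile import NamedTemporaryFile


def _write_temp(content, suffix):
    """Write content to a named temporary file and return its path.

    Assumed body: only the NamedTemporaryFile line appears in the traceback.
    """
    with NamedTemporaryFile(mode='w', suffix=suffix, delete=False) as out:
        out.write(content)
        return out.name


seen_paths = []


def failing_compile(path):
    """Hypothetical stand-in for compile_cxxfile that always fails."""
    seen_paths.append(path)
    raise RuntimeError("compilation failed for " + path)


def compile_with_cleanup(cxxcode, compile_func, keep_temp=False):
    """Run compile_func on a temporary .cpp file, removing it even on error."""
    fdpath = _write_temp(cxxcode, '.cpp')
    try:
        return compile_func(fdpath)
    finally:
        # Runs whether compile_func returned or raised.
        if not keep_temp:
            os.remove(fdpath)


try:
    compile_with_cleanup('int main() { return 0; }', failing_compile)
except RuntimeError:
    pass

print(os.path.exists(seen_paths[0]))
```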


hroncok commented Jul 7, 2021

I seem to be confusing closing with deleting here. It's getting late and I am getting clumsy. Will get back to this tomorrow (or later).
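
The difference between closing and deleting can be seen in a short standalone sketch (the file contents and variable names are made up for illustration). With delete=False, leaving the with block closes the file, so no descriptor stays open, but the file itself stays on disk until it is removed explicitly.

```python
import os
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(mode='w', suffix='.cpp', delete=False) as out:
    out.write('// placeholder\n')
    path = out.name

# The context manager has closed the file, releasing its descriptor...
closed_after_with = out.closed
# ...but delete=False means the file is still on disk.
exists_after_with = os.path.exists(path)

# Deleting is a separate step and has nothing to do with open descriptors.
os.remove(path)
exists_after_remove = os.path.exists(path)

print(closed_after_with, exists_after_with, exists_after_remove)
```

So a file that is never removed wastes space in /tmp but does not count against the open files limit.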
