New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix segfault when an error occurs in np.fromfile #12354
BUG: Fix segfault when an error occurs in np.fromfile #12354
Conversation
numpy/core/tests/test_multiarray.py
Outdated
with open(self.filename, 'rb') as f: | ||
assert_raises(ValueError, np.fromfile, f, np.unicode) | ||
|
||
assert_raises(ValueError, np.fromfile, self.filename, np.unicode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test isn't quite the same as the case reported in the issue.
The reason here is that on some platforms an overcommitted allocation could be possible, by attempting a read of invalid unicode data we get a test which is not flaky.
I've checked that this test will also trigger a segfault (on Python 3.6 - Ubuntu 18.04) without the fix proposed in this PR.
* we need to clear it, and restore it later to ensure that | ||
* we can cleanup the duplicated file descriptor properly. | ||
*/ | ||
PyErr_Fetch(&err_ptype, &err_pvalue, &err_ptraceback); | ||
if (npy_PyFile_DupClose2(file, fp, orig_pos) < 0) { | ||
goto fail; | ||
} | ||
if (own && npy_PyFile_CloseFile(file) < 0) { | ||
goto fail; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be neat to call _PyErr_ChainExceptions
here, although you'll need to add a backport of that in numpy\core\include\numpy\npy_3kcompat.h
(almost identical to npy_PyErr_ChainExceptionsCause
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the C-API that I'm not really familiar with - npy_PyErr_ChainExceptionsCause
looks like it's almost identical to _PyErr_ChainExceptions
.
What are the differences between them and why would I want to use one over the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyErr_ChainExceptions
isn't available in python 2.7, which is why you can't use it directly.
_PyErr_ChainExceptions
gives the behavior of
>>> try:
raise ValueError
except:
raise KeyError
Traceback (most recent call last):
File "<pyshell#5>", line 2, in <module>
raise ValueError
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<pyshell#5>", line 4, in <module>
raise KeyError
KeyError
While npy_PyErr_ChainExceptionsCause
gives the behavior of:
>>> try:
raise ValueError
except Exception as e:
raise KeyError from e
Traceback (most recent call last):
File "<pyshell#7>", line 2, in <module>
raise ValueError
ValueError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<pyshell#7>", line 4, in <module>
raise KeyError from e
KeyError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, I've added the implementation of that and have used it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix looks obviously correct, thanks. I wonder how many other cases we have of a C-side finally
statement written poorly...
d70355a
to
6a29778
Compare
if (PyErr_Occurred()) { | ||
/* only py3 supports this anyway */ | ||
#ifdef NPY_PY3K | ||
PyObject *exc2, *val2, *tb2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation has been taken from the cpython source: https://github.com/python/cpython/blob/50b48572d9a90c5bb36e2bef6179548ea927a35a/Python/errors.c#L389-L414
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!
I assume you tested the exception chaining locally - I don't think it's worth adding a test for, but if you can paste a traceback in a comment here showing the result, that would be great!
I think this will produce the output I want to see, but it's way too dangerous to put into our unit tests: import builtins
import numpy as np
# make sure that the `close` member of files fails.
old_open = builtins.open
def bad_open(*args, **kwargs):
f = old_open(*args, **kwargs)
def bad_close():
f.close()
raise RuntimeError
f.close = bad_close
return f
builtins.open = bad_open
np.fromfile(fname, dtype=object) |
I'm not entirely sure that you were aiming for a
|
There is a problem with the way in which we handle errors which occur in the call to `PyArray_FromFile` in `np.fromfile`. The problem here is twofold. 1. The return value isn't checked, therefore if we reach the fail block, we will attempt a DECREF on a NULL and go down in flames. 2. The cleanup code on the filepointers (most notabily the call to `npy_PyFile_DupClose2`) assumes that there is no error set to work. This PR addresses these issues 1. By adding a NULL check to the fail block to ensure we don't attempt a DECREF on a NULL. 2. By saving the error state before attempting the cleanup code on the file descriptor, and then restoring it after. Fixes: numpy#12300
8f9e5aa
to
fbdcb5b
Compare
You're right, I was not aiming for that - but I suppose the output shows what I wanted anyway! |
It's not clear to me that this actually fixes #12300, but it's clear that this PR fixes something, so lets put it in. |
(wrong button, oops) |
Thanks for the fix! |
There is a problem with the way in which we handle
errors which occur in the call to
PyArray_FromFile
in
np.fromfile
.The problem here is twofold.
The return value isn't checked, therefore if we reach
the fail block, we will attempt a DECREF on a NULL and
go down in flames.
The cleanup code on the filepointers (most notabily the
call to
npy_PyFile_DupClose2
) assumes that there is noerror set to work.
This PR addresses these issues
By adding a NULL check to the fail block to ensure we don't
attempt a DECREF on a NULL.
By saving the error state before attempting the cleanup code
on the file descriptor, and then restoring it after.
Fixes: #12300