Fix segfault on repr(OOBucket([... non-Latin-1 thingies ...])) #108

mgedmin · 2019-07-29T13:13:46Z

Fixes #106.

The bug: the return value of PyUnicode_AsLatin1String() was not checked, so when the encoding failed a NULL was passed to PyOS_snprintf().

The fix uses PyUnicode_FromFormat() and avoids all the hassle of manual encoding/sprintf/decoding and even calling repr().
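For readers unfamiliar with this class of bug, here is a minimal sketch in plain C of the pattern described above: an encoder that can fail, and a formatter that checks the result before handing it to snprintf(). This is not the BTrees code. The names encode_latin1 and format_repr are hypothetical stand-ins, and the sketch deliberately avoids the Python C API so it compiles on its own.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an encoder that can fail. Returns a newly
 * allocated, NUL-terminated Latin-1 string, or NULL if any code point is
 * above 0xFF (or if allocation fails). The caller must free the result. */
static char *encode_latin1(const unsigned int *codepoints, size_t n)
{
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        if (codepoints[i] > 0xFF) {
            free(out);
            return NULL;            /* not representable in Latin-1 */
        }
        out[i] = (char)codepoints[i];
    }
    out[n] = '\0';
    return out;
}

/* Hypothetical formatter: writes "name(items)" into buf.
 * Returns 0 on success, -1 if encoding failed or the output was truncated. */
static int format_repr(char *buf, size_t size, const char *name,
                       const unsigned int *codepoints, size_t n)
{
    char *encoded = encode_latin1(codepoints, n);
    int rv;

    /* This is the check whose absence causes the crash: passing a NULL
     * pointer for a %s argument is undefined behaviour. */
    if (encoded == NULL)
        return -1;

    rv = snprintf(buf, size, "%s(%s)", name, encoded);
    free(encoded);
    if (rv < 0 || (size_t)rv >= size)
        return -1;                  /* formatting error or truncation */
    return 0;
}
```

With this shape, input containing a code point such as 0x1234 makes format_repr() return -1 rather than crash. The PR goes further and removes the encode step entirely, which is why no such check on an encoded buffer is needed after the fix.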

I've also added a regression test and tweaked tox.ini to set PYTHONFAULTHANDLER=1 to make the segfault visible -- without that all you see is the test runner exiting unexpectedly with no additional output (and you have to interpret "exited with code -11" as died from a SIGSEGV yourself).

The test fails only on Python 3, and only on non-pure builds. I've also updated tox.ini to use extras=foo instead of deps=.[foo], and to use factors to set PURE_PYTHON=1 instead of defining an entire environment. This lets you run arbitrary pure-Python builds such as tox -e py37-pure.

jamadden

I think it looks good.

Probably needs a change note.

I wonder if it might be more clear at this point to put the #ifdef PY3K around the entire definition of bucket_repr, since there's essentially one line of shared code now. That would save the large stack allocation. (PyObject_Repr deals gracefully with a NULL input, but sadly PyUnicode_FromFormat("%R") does not, so the check on bucket_items is still necessary.)

#ifdef PY3K
static PyObject *
bucket_repr(Bucket *self)
{
    PyObject *i, *r;

    i = bucket_items(self, NULL, NULL);
    if (!i)
        return NULL;
    r = PyUnicode_FromFormat(...);
    Py_DECREF(i);
    return r;
}
#else
static PyObject *
bucket_repr(Bucket *self)
{
    PyObject *i, *r;
    char repr[10000];
    int rv;

    i = bucket_items(self, NULL, NULL);
    if (!i)
        return NULL;
    /* complicated code here */
}
#endif

mgedmin · 2019-07-29T14:25:02Z

The large stack allocation is inside a #ifndef PY3K, so it should already be saved.

Splitting the entire definition would probably improve readability. (OTOH I can't wait for Jan 2020 to roll around so I can submit a PR ripping out Python 2 compatibility and all the ifdefs.)

Good catch on the changelog note!

mgedmin added 2 commits July 29, 2019 15:39

mgedmin requested review from tseaver and jamadden July 29, 2019 13:15

jamadden approved these changes Jul 29, 2019

Add a changelog note

05df5d0

tseaver approved these changes Jul 29, 2019

mgedmin merged commit 44fd158 into master Jul 30, 2019

mgedmin deleted the segfault-on-unicode branch July 30, 2019 08:21
