Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Missing return in __Pyx_UnicodeContainsUCS4 with PEP393 #3925

Closed
jmd-dk opened this issue Dec 1, 2020 · 4 comments · Fixed by #4135
Closed

[BUG] Missing return in __Pyx_UnicodeContainsUCS4 with PEP393 #3925

jmd-dk opened this issue Dec 1, 2020 · 4 comments · Fixed by #4135

Comments

@jmd-dk
Copy link
Contributor

jmd-dk commented Dec 1, 2020

Describe the bug
In Cython/Utility/StringTools.c, the function __Pyx_UnicodeContainsUCS4() is missing a return value in the case of CYTHON_PEP393_ENABLED (Python 3.9) and kind == PyUnicode_WCHAR_KIND.

When compiling the C code, i get the warning

myfile.c: In function ‘__Pyx_UnicodeContainsUCS4’:
myfile.c:104840:1: warning: control reaches end of non-void function [-Wreturn-type]
104840 | }
| ^

Environment (please complete the following information):

  • OS: Linux
  • Python version: 3.9.0
  • Cython version: 0.29.21 (but also master)

Fix
Below is the current function with a single line added, constituting a fix. I suppose returning 1 is the correct thing to do in this seemingly overlooked case.

static CYTHON_INLINE int __Pyx_UnicodeContainsUCS4(PyObject* unicode, Py_UCS4 character) {
#if CYTHON_PEP393_ENABLED
    const int kind = PyUnicode_KIND(unicode);
    if (likely(kind != PyUnicode_WCHAR_KIND)) {
        Py_ssize_t i;
        const void* udata = PyUnicode_DATA(unicode);
        const Py_ssize_t length = PyUnicode_GET_LENGTH(unicode);
        for (i=0; i < length; i++) {
            if (unlikely(character == PyUnicode_READ(kind, udata, i))) return 1;
        }
        return 0;
    }
    return 1;  // <-- NEW
#elif PY_VERSION_HEX >= 0x03090000
    #error Cannot use "UChar in Unicode" in Python 3.9 without PEP-393 unicode strings.
#endif
#if PY_VERSION_HEX < 0x03090000
#if Py_UNICODE_SIZE == 2
    if (unlikely(character > 65535)) {
        return __Pyx_PyUnicodeBufferContainsUCS4_SP(
            PyUnicode_AS_UNICODE(unicode),
            PyUnicode_GET_SIZE(unicode),
            character);
    } else
#endif
    {
        return __Pyx_PyUnicodeBufferContainsUCS4_BMP(
            PyUnicode_AS_UNICODE(unicode),
            PyUnicode_GET_SIZE(unicode),
            character);

    }
#endif
}
@da-woods
Copy link
Contributor

da-woods commented Dec 1, 2020

I'd think it should be return 0 (or possibly return -1 + an error - I'm not sure right now what exceptions specification that has) rather than return 1 that you add.

I don't think "string is a confusing type" translates to "found the character"

@jmd-dk
Copy link
Contributor Author

jmd-dk commented Dec 1, 2020

I don't think "string is a confusing type" translates to "found the character"

Ah right, 1 is the "success" value. I meant return 0 then, though a third value for error handling would of course be ideal.

@jmd-dk
Copy link
Contributor Author

jmd-dk commented Dec 4, 2020

On further thought, the return should probably be placed as the very last line in __Pyx_UnicodeContainsUCS4(), outside all macros, as the problem is not exclusive to #if CYTHON_PEP393_ENABLED.

@robertwb
Copy link
Contributor

Yes, it would be better to return an error in this case, possibly falling back to the code in the lower half of the function which will raise the right error, or, if this is not appropriate for Python 3.9, a call to the appropriate Python C API function.

da-woods added a commit to da-woods/cython that referenced this issue Apr 24, 2021
cython#3925

This is a corner-case mostly to silence C warnings. It can happen
with strings created manually (i.e. not by Cython) using the deprecated
C-API on Python 3.9+

I haven't added a test since it seems quite hard to construct a
string that fails it
scoder added a commit to scoder/cython that referenced this issue Apr 24, 2021
…h Py3.9+ and prepare for Py3.12 where "PyUnicode_WCHAR_KIND" will be removed.

See https://www.python.org/dev/peps/pep-0623/
Closes cython#3925
scoder added a commit that referenced this issue Apr 25, 2021
…h Py3.9+. (GH-4135)

* Use the same fallback as for missing PEP-393 support.
* Prepare for "PyUnicode_READY()" and "PyUnicode_WCHAR_KIND" to be removed in Py3.12.
  See https://www.python.org/dev/peps/pep-0623/
* Avoid C compiler warnings about deprecated C-API functions in Py3.9+.

Closes #3925
scoder added a commit that referenced this issue Apr 25, 2021
…h Py3.9+. (GH-4135)

* Use the same fallback as for missing PEP-393 support.
* Prepare for "PyUnicode_READY()" and "PyUnicode_WCHAR_KIND" to be removed in Py3.12.
  See https://www.python.org/dev/peps/pep-0623/
* Avoid C compiler warnings about deprecated C-API functions in Py3.9+.

Closes #3925
@scoder scoder added this to the 0.29.24 milestone Apr 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants