Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] fstring.format_c_numbers crashes with bus error #4343

Closed
DerDakon opened this issue Aug 15, 2021 · 13 comments
Closed

[BUG] fstring.format_c_numbers crashes with bus error #4343

DerDakon opened this issue Aug 15, 2021 · 13 comments

Comments

@DerDakon
Copy link

DerDakon commented Aug 15, 2021

Describe the bug

Running the testsuite on my Sparc machine (these are very picky about unaligned accesses) crashes in fstring.format_c_numbers.

Doctest: fstring.format_c_number_range_width_m4 ... ok
format_c_numbers (fstring)
Doctest: fstring.format_c_numbers ... /var/tmp/portage/dev-python/cython-0.29.24/temp/environment: line 3005:    65 Bus error               "${PYTHON}" runtests.py -vv --work-dir "${BUILD_DIR}"/tests

Environment (please complete the following information):

  • OS: Gentoo Linux
  • Python 3.9.6
  • Cython 0.29.22, 0.29.24

The error did not happen in 0.9.21

I'll try to get a backtrace, a hint on the right commandline to run this one test under gdb is welcome.

@da-woods
Copy link
Contributor

da-woods commented Aug 15, 2021

Backtrace would always be useful.

What I'd probably do is run it outside the test-suite: something like:

CFLAGS=-Og python3 cythonize.py tests/run/fstring.pyx  # builds the file with debug flags
# change to the tests/run directory
gdb python3
# in gdb
run
# in Python
import fstring
import doctest
doctest.testmod(fstring)

(that's untested though, but hopefully gives you a start anyway).

Other things that might be useful:

  • Try with define -DCYTHON_USE_PYLONG_INTERNALS=0 as a CFLAG (my uneducated guess is that the problem lies here and that the define will fix the problem)
  • git bisect (between 0.29.21 and 0.29.21) to pin it down to a commit

@DerDakon
Copy link
Author

Something in between is missing:

>>> import fstring
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'fstring'
>>> import doctest
>>>

@da-woods
Copy link
Contributor

da-woods commented Aug 18, 2021

Possibly

# change to the tests/run directory

Hopefully there should be a fstring.something.so file in the directory you're running Python from

@DerDakon
Copy link
Author

#0  0xf729b060 in __pyx_pw_7fstring_9format_c_numbers () from /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24-python3_9/tests/run/c/fstring/fstring.cpython-39-sparc-linux-gnu.so
#1  0xf7d13918 in ?? () from /usr/lib/libpython3.9.so.1.0

@DerDakon
Copy link
Author

That so is from the original build. If I delete it I don't get it again, so some step is missing I fear.

chroot /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24 #  CFLAGS="-Og -Wcast-align -g" python3 cythonize.py tests/run/fstring.pyx
Compiling /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24/tests/run/fstring.pyx because it changed.
[1/1] Cythonizing /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24/tests/run/fstring.pyx
/var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24/tests/run/fstring.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
chroot /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24 #  find ../.. -name fstring\*.so
chroot /var/tmp/portage/dev-python/cython-0.29.24/work/cython-0.29.24 #

@mgorny
Copy link
Contributor

mgorny commented Aug 18, 2021

@DerDakon, I think you need to pass -i to cythonize.

Here's the backtrace I've gotten:

Program received signal SIGBUS, Bus error.
0xfff8000101777324 in __Pyx_PyUnicode_From_signed__char (format_char=<optimized out>, padding_char=<optimized out>, 
    width=<optimized out>, value=<optimized out>) at /home/mgorny/cython/tests/run/fstring.c:9265
9265	            *(uint16_t*)dpos = ((const uint16_t*)DIGIT_PAIRS_10)[digit_pos];
(gdb) bt
#0  0xfff8000101777324 in __Pyx_PyUnicode_From_signed__char (format_char=<optimized out>, padding_char=<optimized out>, 
    width=<optimized out>, value=<optimized out>) at /home/mgorny/cython/tests/run/fstring.c:9265
#1  __pyx_pf_7fstring_8format_c_numbers (__pyx_self=__pyx_self@entry=0x0, __pyx_v_c=__pyx_v_c@entry=123 '{', 
    __pyx_v_s=__pyx_v_s@entry=135, __pyx_v_n=__pyx_v_n@entry=12, __pyx_v_l=__pyx_v_l@entry=12312312, 
    __pyx_v_f=__pyx_v_f@entry=2.34559989, __pyx_v_d=3.1415926000000001) at /home/mgorny/cython/tests/run/fstring.c:2737
#2  0xfff80001017864b8 in __pyx_pw_7fstring_9format_c_numbers (__pyx_self=0x0, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>)
    at /home/mgorny/cython/tests/run/fstring.c:2701
#3  0xfff8000100223840 in ?? () from /usr/lib64/libpython3.9.so.1.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@mgorny
Copy link
Contributor

mgorny commented Aug 18, 2021

However, the warning output really tells already what's wrong:

/home/mgorny/cython/tests/run/fstring.c:9128:14: warning: cast increases required alignment of target type [-Wcast-align]
 9128 |             *(uint16_t*)dpos = ((const uint16_t*)DIGIT_PAIRS_8)[digit_pos];
      |              ^

In Cython/Utility/TypeConversion.c, the two lines:

*(uint16_t*)dpos = ((const uint16_t*)DIGIT_PAIRS_8)[digit_pos]; /* copy 2 digits at a time */

and

*(uint16_t*)dpos = ((const uint16_t*)DIGIT_PAIRS_10)[digit_pos]; /* copy 2 digits at a time */

are wrongly forcing 16-bit alignment. I can avoid the SIGBUS by forcing two 8-bit ops instead.

@DerDakon
Copy link
Author

Simply use memcpy(), the compiler knows when it can be reduced to a MOV or something similar because the platform does not need special alignment handling.

@da-woods
Copy link
Contributor

da-woods commented Aug 18, 2021

Thanks @DerDakon and @mgorny (and sorry for my incomplete debugging suggestions). It sounds like memcpy is probably the right fix. Although we don't test Sparc so it's probably quite hard for us to catch any regressions.

The code in question looks to be about 5 years old so I'm not sure what changed that it started failing (probably doesn't really matter though)

@scoder
Copy link
Contributor

scoder commented Aug 31, 2021

Simply use memcpy(), the compiler knows when it can be reduced to a MOV or something similar because the platform does not need special alignment handling.

To be clear: are you suggesting to call memcpy() repeatedly to copy 2 byte pairs? I understand that memcpy() is something that C compilers would commonly special case.

This code in intended to compile down to a fast loop copying indirectly addressed bytes or byte pairs from a lookup table. Maybe we can just disable the two-byte copying on platforms that require aliasing and fall back to single bytes. But it's obviously more risky to detect such platforms ourselves than to let the C compiler do it for us. If memcpy() does that, fine.

@DerDakon
Copy link
Author

Yes, just always use memcpy(). You can use the compiler explorer and use something like this to see that basically no compiler on the standard platforms actually issues any function call:

#include <cstring>
short shorten(int *nums, int idx) {
    short foo;
    memcpy(&foo, nums + idx, sizeof(foo));
    return foo;
}

Ok, not exactly. "ARM gcc 8.3.1" does issue a memcpy(), but version 9 already optimizes that out. For x86-64 not even gcc 4.1 issues a call. MSVC x86 seems particularly dumb, but who cares for that these days anyway?

@da-woods
Copy link
Contributor

da-woods commented Aug 31, 2021

MSVC x86 seems particularly dumb, but who cares for that these days anyway?

MSVC is essentially the only way to get Cython to run on Windows (which is probably a more important platform to support that Sparc ;) ).

Fortunately it does get it right and remove the memcpy once you compile it in optimized mode so I don't think there's an issue there. I suspect that's the same with most of the compilers on there - if you add an O2 flag (which distutils does by default I think) they'll get rid of it

@DerDakon
Copy link
Author

I meant Windows x86. I suspect that beyond some tiny pseudo-embedded things noone has ever seen a recent Windows version on such a machine. Let alone one that isn't actually capable of running it in 64 bit mode instead.

scoder added a commit to scoder/cython that referenced this issue Aug 31, 2021
…em to be aligned. Use memcpy() instead to let the C compiler decide how to do it.

Closes cython#4343
@scoder scoder closed this as completed in 2c08fd5 Sep 1, 2021
scoder added a commit that referenced this issue Sep 1, 2021
…em to be aligned. Use memcpy() instead to let the C compiler decide how to do it.

Closes #4343
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants