segfault on s390x #333

Closed
tjaalton opened this issue Nov 24, 2023 · 10 comments

Comments

@tjaalton

What went wrong?

segfault running ipa-client-install on s390x, happens also with 1.8.3

Program received signal SIGSEGV, Segmentation fault.
0x000003fffce82804 in ?? () from /usr/lib/python3/dist-packages/gssapi/raw/_enum_extensions/ext_dce.cpython-310-s390x-linux-gnu.so
(gdb) bt
#0 0x000003fffce82804 in ?? ()
   from /usr/lib/python3/dist-packages/gssapi/raw/_enum_extensions/ext_dce.cpython-310-s390x-linux-gnu.so
#1 0x000002aa00214248 in PyModule_ExecDef (
    module=<module at remote 0x3fffd4a2340>, def=<optimized out>)
    at ../Objects/moduleobject.c:407
#2 0x000002aa002150c6 in _imp_exec_builtin_impl (mod=<optimized out>,
    module=<optimized out>) at ../Python/import.c:2091
#3 _imp_exec_builtin (module=<optimized out>, mod=<optimized out>)
    at ../Python/clinic/import.c.h:388
#4 0x000002aa001171d4 in cfunction_vectorcall_O (
    func=<built-in method exec_dynamic of module object at remote 0x3fffd796610>, args=0x3fffd459408, nargsf=<optimized out>, kwnames=<optimized out>)
    at ../Objects/methodobject.c:516
#5 0x000002aa000feee4 in do_call_core (kwdict={},
    callargs=(<module at remote 0x3fffd4a2340>,), func=<optimized out>,
    trace_info=<optimized out>, tstate=<optimized out>)
    at ../Python/ceval.c:5945
#6 _PyEval_EvalFrameDefault (tstate=<optimized out>,
    f=Frame 0x3fffd44e050, for file <frozen importlib._bootstrap>, line 241, in _call_with_frames_removed (f=<built-in method exec_dynamic of module object at remote 0x3fffd796610>, args=(<module at remote 0x3fffd4a2340>,), kwds={}),
    throwflag=<optimized out>) at ../Python/ceval.c:4277

How do we reproduce?

needs an s390x host to join an IPA domain

Component versions (python-gssapi, Kerberos, OS / distro, etc.)

Happens at least on Ubuntu 22.04:
python-gssapi 1.6.12 (also tested with 1.8.3)
cython 0.29.28

and 23.10:
python-gssapi 1.8.2
cython 0.29.36

@jborean93
Contributor

I unfortunately don't have a host to really test this on so it's not going to be something I can easily fix. One thing that would be good to know is whether it fails trying to import gssapi or whether it fails when you are actually using it.

@jborean93
Contributor

I've tried to reproduce this with the latest release using binfmt to run under a different architecture and was unable to replicate the stacktrace.

# podman run --rm -it --arch=s390x debian:11

apt-get update
apt-get upgrade

export DEBIAN_FRONTEND=noninteractive
apt-get install python3 python3-pip python3-venv python3-dev gcc krb5-user libkrb5-dev git curl wget vim

python3 -m venv .venv
. .venv/bin/activate
python -m pip install gssapi requests

# Configure your krb5.conf as needed
kinit username@DOMAIN.COM
python winrm_test.py target-host.domain.com whoami

The winrm_test.py script is from https://gist.github.com/jborean93/4bc4f20bff0ec6a2496eb511d055a8fa. While you may not have WinRM available, it exercises the gss_init_sec_context, gss_wrap_iov, and gss_unwrap_iov calls; the latter two are part of the ext_dce extension referenced in your stacktrace. From my perspective the issue doesn't seem to happen on 1.8.3 in this test environment. Without more info there's not much else I can do, sorry.

@frank-heimes

frank-heimes commented Nov 30, 2023

Thx jborean93 for taking a look.

It was already tested with a test build of a python3-gssapi package in v1.8.3:
https://launchpad.net/~freeipa/+archive/ubuntu/staging/+packages
but still failed:
https://bugs.launchpad.net/ubuntu/+source/python-gssapi/+bug/2044242/comments/11

I cannot follow your steps to reproduce this, since I run into this error:
$ python -m pip install --proxy=http://squid.internal:3128 gssapi requests
Collecting gssapi
  Downloading gssapi-1.8.3.tar.gz (94 kB)
     94.2/94.2 KB 633.2 kB/s eta 0:00:00
  Installing build dependencies ... error
  error: subprocess-exited-with-error

    × pip subprocess to install build dependencies did not run successfully.
    │ exit code: 1
    ╰─> [7 lines of output]
    WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x3ff80bd0130>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/cython/
    WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x3ff80bd38e0>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/cython/
    WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x3ff80bd3c10>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/cython/
    WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x3ff80bd07c0>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/cython/
    WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x3ff80bd2a40>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/cython/
    ERROR: Could not find a version that satisfies the requirement Cython<4.0.0,>=0.29.29 (from versions: none)
    ERROR: No matching distribution found for Cython<4.0.0,>=0.29.29
    [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
<It may be due to the slow connectivity caused by the proxy I have to use on this s390x system.>

But anyway, I believe your attempt will also not work for me - on a s390x system.

Because it works fine for me (even using ipa-client-install) on an amd64 system:


root@jammy:~# sudo ipa-client-install
This program will set up IPA client.
Version 4.9.8

WARNING: conflicting time&date synchronization service 'ntp' will be disabled in favor of chronyd

DNS discovery failed to determine your DNS domain
Provide the domain name of your IPA server (ex: example.com):


it's highly likely an s390x-specific issue!

@jborean93
Contributor

it's highly likely an s390x-specific issue!

Sorry if I wasn't clear: the test I ran used s390x emulation with podman run --rm -it --arch=s390x debian:11. I unfortunately don't have access to a real host to test on, but this should be a fairly accurate representation. The errors you have seem to be a problem with the proxy setup, as it's failing to access PyPI. You could download the packages offline and install them that way, but if you also don't have a Windows host accessible, the test script I gave you won't really be usable anyway.

I've now retested it with an Ubuntu 22.04 image and can replicate the segfault when installing the package from the normal repos.

# podman run --rm -it --arch=s390x ubuntu:22.04

apt-get update
apt-get install python3-gssapi

# Segmentation fault
python3 -c "import gssapi"

The same test in Debian 11 and 12 works just fine, though, indicating their built packages are fine. Ubuntu 22.04 ships with gssapi 1.6.12, Debian 11 with 1.6.11, and Debian 12 with 1.8.2. The fact that it works fine in Debian for 1.6.11 makes me suspicious of the build process that created the Ubuntu artifact.
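To tell an import-time crash apart from an ordinary Python exception, the one-liner above can be wrapped in a child process so that the parent survives the segfault. This is only a sketch: the helper name `probe` is hypothetical, and it relies on the POSIX behaviour where `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def probe(code, python=sys.executable):
    """Run `code` in a fresh interpreter and summarise how it ended.

    `probe` is an illustrative helper, not part of python-gssapi.
    """
    proc = subprocess.run([python, "-c", code], capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX a negative return code means the child was killed by a signal.
        return "signal:" + signal.Signals(-proc.returncode).name
    if proc.returncode == 0:
        return "ok"
    return "exit:%d" % proc.returncode


if __name__ == "__main__":
    # Importing the extension module named in the backtrace narrows things down.
    for target in ("gssapi", "gssapi.raw._enum_extensions.ext_dce"):
        print(target, probe("import " + target))
```

A result of `signal:SIGSEGV` points at a crash in native code during import, while `exit:1` would indicate a normal Python error such as an ImportError.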

The next thing I tried was manually building it myself using the Python sdist. This is how I originally installed it with my previous Debian test but this time round I tried Ubuntu 22.04.

# podman run --rm -it --arch=s390x ubuntu:22.04

apt-get update
apt-get install python3 python3-dev python3-pip python3-venv gcc libkrb5-dev krb5-user

python3 -m venv .venv
. .venv/bin/activate

python3 -m pip install 'gssapi==1.6.12' --verbose

python3 -c "import gssapi"

This fails with

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/.venv/lib/python3.10/site-packages/gssapi/__init__.py", line 31, in <module>
    from gssapi.raw.types import NameType, RequirementFlag, AddressType  # noqa
  File "/.venv/lib/python3.10/site-packages/gssapi/raw/__init__.py", line 50, in <module>
    from gssapi.raw.creds import *  # noqa
  File "gssapi/raw/creds.pyx", line 1, in init gssapi.raw.creds
ImportError: /.venv/lib/python3.10/site-packages/gssapi/raw/oids.cpython-310-s390x-linux-gnu.so: undefined symbol: _PyGen_Send

This is somewhat expected, as the sdists for 1.6.x included already-cythonized .c files, which always had problems with Python versions released after that point in time. In the 1.8.x releases we migrated to a PEP 517 compliant sdist where the installer needs to cythonize the files as part of the build process, allowing it to take advantage of the newer Cython versions available at build time. We can ignore this, as Ubuntu's builds use the pure source files and cythonize them with a newer version of Cython than the one that generated the files in the sdist.

Using a newer Cython version together with the original source, we can see that the resulting build works fine.

# podman run --rm -it --arch=s390x ubuntu:22.04

apt-get update
apt-get install python3 python3-dev python3-pip python3-venv gcc libkrb5-dev krb5-user

python3 -m venv .venv
. .venv/bin/activate

# Cython 3 support was added in v1.8.3
python3 -m pip install 'cython<3.0.0'
python3 -m pip install https://github.com/pythongssapi/python-gssapi/archive/v1.6.12.tar.gz --verbose

python3 -c "import gssapi"

This works just fine and the package does not error or segfault here. Looking at the build logs for this package for Jammy at https://launchpadlibrarian.net/591008002/buildlog_ubuntu-jammy-s390x.python-gssapi_1.6.12-1build2_BUILDING.txt.gz, I can see that the build process is using the package called cython3, which for Jammy is at 0.29.28. The cython3 here is the package name and does not mean Cython 3.x.y, which is an important distinction to make.

...
Setting up python3-pycodestyle (2.8.0-2) ...
Setting up cython3 (0.29.28-1ubuntu1) ...
Setting up libpython3.10-dev:s390x (3.10.3-1) ...
Setting up python3.10-dev (3.10.3-1) ...
Setting up python3-mccabe (0.6.1-3) ...
...
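Pulling the package versions out of a build log like this can be scripted. The following is a rough sketch that assumes the `Setting up NAME (VERSION) ...` line format shown in the excerpt; the function names are hypothetical, and the sample log is the excerpt itself.

```python
import re

# Matches dpkg lines such as "Setting up cython3 (0.29.28-1ubuntu1) ...",
# dropping an optional ":arch" suffix from the package name.
SETTING_UP = re.compile(r"^Setting up ([^\s:]+)(?::\S+)? \(([^)]+)\) \.\.\.\s*$")


def installed_versions(log_text):
    """Map package name -> version string from 'Setting up' lines."""
    versions = {}
    for line in log_text.splitlines():
        match = SETTING_UP.match(line.strip())
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def upstream_major(debian_version):
    """Leading number of a Debian version, ignoring an optional 'epoch:' prefix."""
    upstream = debian_version.split(":", 1)[-1]
    return int(re.match(r"\d+", upstream).group(0))


log = """\
Setting up python3-pycodestyle (2.8.0-2) ...
Setting up cython3 (0.29.28-1ubuntu1) ...
Setting up libpython3.10-dev:s390x (3.10.3-1) ...
Setting up python3.10-dev (3.10.3-1) ...
Setting up python3-mccabe (0.6.1-3) ...
"""
versions = installed_versions(log)
print(versions["cython3"], "-> Cython major", upstream_major(versions["cython3"]))
```

The major version check makes the naming trap explicit: the package is called cython3, but the version it installs here has major version 0.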

When building from the same build2 source as the Jammy build with the cython3 package rather than Cython from PyPI it still works just fine.

# podman run --rm -it --arch=s390x ubuntu:22.04

apt-get update
apt-get install python3 python3-dev python3-pip gcc libkrb5-dev krb5-user cython3

python3 -m pip install https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/python-gssapi/1.6.12-1build2/python-gssapi_1.6.12.orig.tar.gz --verbose


python3 -c "import gssapi"

Unfortunately at this point I don't know what else it could be. Building it locally with the same Cythonized files and source as the build job seems to work just fine but the pre-built package is having troubles. At this point I'm not sure what else I can do here as it seems to be more of a build problem rather than an actual code problem.

@stliibm

stliibm commented Dec 1, 2023

Hi,

I see this issue on Ubuntu 23.10 on my real s390x machine with python3-gssapi/mantic 1.8.2-1build1.
After also installing python3-gssapi-dbgsym, the backtrace looks like this:

(gdb) bt
#0  __pyx_pymod_exec_ext_dce (__pyx_pyinit_module=<optimized out>)
    at gssapi/raw/_enum_extensions/ext_dce.c:1627
#1  0x00000000011d7ec8 in PyModule_ExecDef (module=0x3fff70ef970, def=<optimized out>)
    at ../Objects/moduleobject.c:419
#2  0x00000000011e1136 in _imp_exec_builtin_impl (mod=<optimized out>, module=<optimized out>)
    at ../Python/import.c:2427
#3  _imp_exec_builtin (module=<optimized out>, mod=<optimized out>) at ../Python/clinic/import.c.h:504
#4  0x00000000010dba28 in cfunction_vectorcall_O (func=0x3fff7c8a890, args=0x3fff75ce188, 
    nargsf=<optimized out>, kwnames=<optimized out>) at ../Include/cpython/methodobject.h:52
#5  0x00000000010c05a4 in do_call_core (use_tracing=<optimized out>, kwdict=0x3fff706bf80, 
    callargs=0x3fff75ce170, func=0x3fff7c8a890, tstate=<optimized out>) at ../Python/ceval.c:7343
#6  _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>)
    at ../Python/ceval.c:5367

The segfault happens here:

0x3fff6e82606 <__pyx_pymod_exec_ext_dce+1022>   lgrl    %r2,0x400016418c0
0x3fff6e82606 <__pyx_pymod_exec_ext_dce+1022>:  0xc4    0x28    0x05    0x3d    0xf9    0x5d

=> No mapping at 0x400016418c0 thus we get the segfault.

With objdump, we see that there is a relocation for PyExc_ImportError ...:

objdump -DrR /usr/lib/python3/dist-packages/gssapi/raw/_enum_extensions/ext_dce.cpython-311-s390x-linux-gnu.so
    2606:	c4 28 00 00 00 00 	lgrl	%r2,2606 <PyCode_NewEmpty@plt+0x86e>
2608: R_390_PC32DBL	PyExc_ImportError@Base+0x2

... which can be found in the python-binary:

readelf -Ws /usr/bin/python3
Symbol table '.dynsym' contains 2206 entries:
Num:     Value          Size Type    Bind   Vis      Ndx Name
...
1317: 00000000016418c0     8 OBJECT  GLOBAL DEFAULT   24 PyExc_ImportError

The calculation of the relocation at runtime is:

(0x16418c0  + 0x2 - 0x3fff6e82608) >> 1 ==TRUNCATED-TO-INT== 0x53DF95D

The lgrl instruction then loads from the following address, as observed by the segfault (the stored value counts halfwords, so it is shifted left by one again):
0x3fff6e82606 + (0x53DF95D << 1) = 0x400016418C0
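The arithmetic can be replayed in a few lines of Python. This is a sketch using the addresses quoted in this comment; the function names are made up for illustration, and it assumes the usual R_390_PC32DBL definition of a signed 32-bit displacement counted in halfwords.

```python
MASK32 = 0xFFFFFFFF


def pc32dbl_halfwords(symbol, addend, place):
    """Untruncated displacement (S + A - P) >> 1, counted in halfwords."""
    return (symbol + addend - place) >> 1


def fits_in_field(halfwords):
    """True if the displacement fits the signed 32-bit field of lgrl."""
    return -(1 << 31) <= halfwords < (1 << 31)


def lgrl_target(insn_addr, field):
    """Address lgrl reads: sign-extended field, doubled, relative to the instruction."""
    signed = field - (1 << 32) if field & 0x80000000 else field
    return insn_addr + (signed << 1)


symbol = 0x16418C0       # value of PyExc_ImportError from readelf
place = 0x3FFF6E82608    # address of the relocated field (instruction + 2)
insn = 0x3FFF6E82606     # address of the lgrl instruction

halfwords = pc32dbl_halfwords(symbol, 0x2, place)
field = halfwords & MASK32          # what actually ends up in the instruction

print(fits_in_field(halfwords))       # False
print(hex(field))                     # 0x53df95d
print(hex(lgrl_target(insn, field)))  # 0x400016418c0
```

The required displacement does not fit into the 32-bit field, so the truncated value sends the load to an unmapped address instead of to the symbol at 0x16418c0.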

The sources for python3-gssapi do not contain the C file gssapi/raw/_enum_extensions/ext_dce.c, only gssapi/raw/_enum_extensions/ext_dce.pyx.
I'm not familiar with the build-process from the pyx file to the final shared library. Can you please give me some hints how I get to the point how the c-files are generated and gcc is called.

@jborean93
Contributor

Thanks for the great investigation there, it is very helpful.

Can you please give me some hints how I get to the point how the c-files are generated and gcc is called.

The code that interacts with the C gssapi is written in .pyx files, which Cython uses to generate the actual .c files. Part of the build process is to "cythonize" the .pyx files, i.e. have Cython generate the .c files, which are then compiled using whatever setuptools has configured for the platform. To do this manually you can use the command python -m cython path/to/file.pyx, which for this project would be

# -2 is language mode 2 which this project still uses
python3 -m cython -2 gssapi/raw/*.pyx gssapi/raw/_enum_extensions/*.pyx
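The same invocation can be assembled from Python, which makes it easier to repeat with different interpreters or Cython versions. This is a sketch only: `cython_command` is a hypothetical helper that builds the argument list without running it, and it assumes it is pointed at a python-gssapi source checkout.

```python
import sys
from pathlib import Path


def cython_command(source_root, python=sys.executable):
    """Build the argv for cythonizing every .pyx under gssapi/raw.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    root = Path(source_root)
    patterns = ("gssapi/raw/*.pyx", "gssapi/raw/_enum_extensions/*.pyx")
    sources = sorted(str(p) for pattern in patterns for p in root.glob(pattern))
    if not sources:
        raise FileNotFoundError("no .pyx files found under %s" % root)
    # -2 selects Python 2 language semantics, as in the command above.
    return [python, "-m", "cython", "-2", *sources]
```

With Cython installed, `subprocess.run(cython_command("."), check=True)` from the top of the checkout should then write the generated .c files next to their .pyx sources.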

The version of Cython used here is very important, as the generated code can differ across versions to include bugfixes or support for newer Python versions. For Ubuntu it should typically correspond to the cython3 package installed by apt, but you'll have to check the build logs to see what was actually used for a package.

The build logs for s390x show all the steps Ubuntu takes to build their package. According to these logs, the build retrieves the source code, installs some required packages (including the cython3 package) and then compiles the Python library. You can see the start of the Python build process at the line I: pybuild base:237: /usr/bin/python3 setup.py build . A little further down you can see the actual commands used to compile the auto-generated C files:

s390x-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -specs=/usr/share/dpkg/no-pie-compile.specs -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Igssapi/raw -I./gssapi/raw -I/usr/include/python3.10 -c gssapi/raw/misc.c -o build/temp.linux-s390x-3.10/gssapi/raw/misc.o -isystem /usr/include/mit-krb5 -DHAS_GSSAPI_EXT_H

@stliibm

stliibm commented Dec 4, 2023

Thanks a lot for the explanation.

This seems to be an interaction between three things: the Ubuntu gcc (which is configured with --enable-default-pie), PIE being disabled in the python3-gssapi Ubuntu package because of some past issue (unfortunately there is no hint in the debian/changelog file), and LTO being used by default. In addition, there seems to be a bug in the s390x part of gcc or the linker in this scenario, as I assume it works on other architectures.

From debian/rules:

# PIE currently breaks compilation
export DEB_BUILD_MAINT_OPTIONS = hardening=+all,-pie

If I either rebuild the package without disabling pie:
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
or without lto:
export DEB_BUILD_MAINT_OPTIONS = hardening=+all,-pie optimize=-lto

... I can start ipa-client-install as root:

$ ipa-client-install
This program will set up IPA client.
Version 4.10.2

WARNING: conflicting time&date synchronization service 'ntp' will be disabled in favor of chronyd

DNS discovery failed to determine your DNS domain
Provide the domain name of your IPA server (ex: example.com):

=> Stopped here as I don't have an IPA server and can't reconfigure my system.

Without lto, the shared library contains this relocation in got-slot:
0000000000004fe0 000000310000000a R_390_GLOB_DAT 0000000000000000 PyExc_ImportError + 0
And the code loads from this relocated got-slot:
lgrl %r14,4fe0
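A build log can be scanned for the flag combination discussed in this comment. The sketch below is an assumption-laden illustration: `risky_flag_summary` is a hypothetical helper, and the sample command is a shortened version of the gcc invocation quoted earlier in this thread.

```python
import shlex


def risky_flag_summary(command_line):
    """Report which of the flags discussed in this issue appear in a compile command."""
    args = shlex.split(command_line)
    return {
        "lto": any(a.startswith("-flto") for a in args),
        "no_pie_specs": any(a.startswith("-specs=") and "no-pie" in a for a in args),
        "fpic": "-fPIC" in args,
    }


# Shortened from the gcc command in the build log; only relevant flags are kept.
cmd = (
    "s390x-linux-gnu-gcc -O2 -flto=auto -ffat-lto-objects "
    "-specs=/usr/share/dpkg/no-pie-compile.specs -fPIC "
    "-c gssapi/raw/misc.c -o build/temp.linux-s390x-3.10/gssapi/raw/misc.o"
)
summary = risky_flag_summary(cmd)
print(summary)
if summary["lto"] and summary["no_pie_specs"]:
    print("LTO is combined with the no-PIE specs file")
```

This only detects the combination in a command line; it does not say anything about whether a particular toolchain mishandles it.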

@jborean93
Contributor

Thanks for the investigation. I was able to replicate the stacktrace with the following reproducer now.

# podman run --rm -it --arch=s390x ubuntu:22.04

apt-get update

export DEBIAN_FRONTEND=noninteractive
apt-get install python3 python3-dev python3-pip gcc libkrb5-dev krb5-user cython3

python3 -m pip install 'cython<3.0.0'
CFLAGS="-flto=auto -ffat-lto-objects -specs=/usr/share/dpkg/no-pie-compile.specs" python3 -m pip install https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/python-gssapi/1.6.12-1build2/python-gssapi_1.6.12.orig.tar.gz --verbose

Unfortunately, even when testing with Cython 3.x.y the problem still exists, so I've opened cython/cython#5893 with a simple reproducer using just Cython. There seems to be a fundamental problem there that is a bit out of my depth, but hopefully that reproducer can help figure out what the problem is with the auto-generated code.

@frank-heimes

Based on stliibm's investigations I've rebuilt the package without LTO, and with this build things worked for me.

Cross-posting from Launchpad:
https://bugs.launchpad.net/ubuntu/+source/python-gssapi/+bug/2044242/comments/16

@jborean93
Contributor

Closing, as it looks like Ubuntu has adjusted their build process to solve this issue, and the fundamental problem originally reported is due to the C code generated by Cython, as reported in cython/cython#5893.


4 participants