SIGSEV at hashtable_find_pair #523

Closed
tpgxyz opened this issue Jan 30, 2020 · 9 comments · Fixed by #540

Comments

@tpgxyz
Contributor

tpgxyz commented Jan 30, 2020

Hi,
I've noticed that firewalld-0.8.1 has started to dump core on the OpenMandriva Linux distribution:

Here are the logs

[root@tpg-virtualbox tpg]# coredumpctl debug 689                 
           PID: 689 (firewalld)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2020-01-30 14:32:45 CET (14min ago)
  Command Line: /usr/bin/python /usr/sbin/firewalld --nofork --nopid
    Executable: /usr/bin/python3.8
 Control Group: /system.slice/firewalld.service
          Unit: firewalld.service
         Slice: system.slice
       Boot ID: df6c6a86ce5246baa0cbb48107efd6c2
    Machine ID: 8e3f63c507904cf4b2186f8e6939f94c
      Hostname: tpg-virtualbox
       Storage: /var/lib/systemd/coredump/core.firewalld.0.df6c6a86ce5246baa0cbb48107efd6c2.689.1580391165000000000000.lz4
       Message: Process 689 (firewalld) of user 0 dumped core.
                
                Stack trace of thread 689:
                #0  0x00007fa5920cba3f hashtable_find_pair (libjansson.so.4 + 0x6a3f)
                #1  0x00007fa5920cbac5 hashtable_get (libjansson.so.4 + 0x6ac5)
                #2  0x00007fa5920cf449 unpack (libjansson.so.4 + 0xa449)
                #3  0x00007fa5920ceb48 json_vunpack_ex (libjansson.so.4 + 0x9b48)
                #4  0x00007fa5920cf980 json_unpack (libjansson.so.4 + 0xa980)
                #5  0x00007fa592185eef __json_parse (libnftables.so.1.0.0 + 0x70eef)
                #6  0x00007fa59217c010 nft_run_cmd_from_buffer (libnftables.so.1.0.0 + 0x67010)
                #7  0x00007fa5930af13d n/a (libffi.so.7 + 0x313d)
                #8  0x00007fa5930b3f02 n/a (libffi.so.7 + 0x7f02)
                #9  0x00007fa592953faa _ctypes_callproc (_ctypes.cpython-38-x86_64-linux-gnu.so + 0x15faa)
                #10 0x00007fa59294ed71 PyCFuncPtr_call (_ctypes.cpython-38-x86_64-linux-gnu.so + 0x10d71)
                #11 0x00007fa596c649f7 _PyObject_MakeTpCall (libpython3.8.so.1.0 + 0x17e9f7)
                #12 0x00007fa596d89924 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3924)
                #13 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #14 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #15 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #16 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #17 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #18 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #19 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #20 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #21 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #22 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #23 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #24 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #25 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #26 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #27 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #28 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #29 0x00007fa596c65478 PyVectorcall_Call (libpython3.8.so.1.0 + 0x17f478)
                #30 0x00007fa596d89e7a _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3e7a)
                #31 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #32 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #33 0x00007fa596d89a17 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3a17)
                #34 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #35 0x00007fa596d896f6 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a36f6)
                #36 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #37 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #38 0x00007fa596c65478 PyVectorcall_Call (libpython3.8.so.1.0 + 0x17f478)
                #39 0x00007fa596d89e7a _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3e7a)
                #40 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #41 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #42 0x00007fa596c65478 PyVectorcall_Call (libpython3.8.so.1.0 + 0x17f478)
                #43 0x00007fa596d89e7a _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3e7a)
                #44 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #45 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #46 0x00007fa596c646d6 _PyObject_FastCallDict (libpython3.8.so.1.0 + 0x17e6d6)
                #47 0x00007fa596ce52fd slot_tp_init (libpython3.8.so.1.0 + 0x1ff2fd)
                #48 0x00007fa596ceed24 type_call (libpython3.8.so.1.0 + 0x208d24)
                #49 0x00007fa596c649f7 _PyObject_MakeTpCall (libpython3.8.so.1.0 + 0x17e9f7)
                #50 0x00007fa596d89b05 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3b05)
                #51 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #52 0x00007fa596c6617f _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x18017f)
                #53 0x00007fa596d898af _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a38af)
                #54 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #55 0x00007fa596d89a17 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3a17)
                #56 0x00007fa596c66216 _PyFunction_Vectorcall (libpython3.8.so.1.0 + 0x180216)
                #57 0x00007fa596d89a17 _PyEval_EvalFrameDefault (libpython3.8.so.1.0 + 0x2a3a17)
                #58 0x00007fa596d7feab _PyEval_EvalCodeWithName (libpython3.8.so.1.0 + 0x299eab)
                #59 0x00007fa596de7770 run_eval_code_obj (libpython3.8.so.1.0 + 0x301770)
                #60 0x00007fa596de76bd run_mod (libpython3.8.so.1.0 + 0x3016bd)
                #61 0x00007fa596de52a7 PyRun_FileExFlags (libpython3.8.so.1.0 + 0x2ff2a7)
                #62 0x00007fa596de4ae7 PyRun_SimpleFileExFlags (libpython3.8.so.1.0 + 0x2feae7)
                #63 0x00007fa596e0e0bc pymain_run_file (libpython3.8.so.1.0 + 0x3280bc)

GNU gdb (GDB) 9.0.90.20191228-1 (OpenMandriva Lx release 4.1)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-openmandriva-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python3.8...
Reading symbols from /usr/lib/debug/usr/bin/python3.8-3.8.1-3.x86_64.debug...

warning: core file may not match specified executable file.
[New LWP 689]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

warning: the debug information found in "/usr/lib/debug//usr/lib64/libffi.so.7.1.0-3.3-1.x86_64.debug" does not match "/usr/lib64/libffi.so.7" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug//usr/lib64/libffi.so.7.1.0-3.3-1.x86_64.debug" does not match "/usr/lib64/libffi.so.7" (CRC mismatch).

Core was generated by `/usr/bin/python /usr/sbin/firewalld --nofork --nopid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  hashtable_find_pair () at hashtable.c:93
93              if(pair->hash == hash && strcmp(pair->key, key) == 0)
(gdb) bt full
#0  hashtable_find_pair () at hashtable.c:93
No locals.
#1  0x00007fa5920cbac5 in hashtable_get () at hashtable.c:279
No locals.
#2  0x00007fa5920cf449 in json_object_get () at value.c:98
No locals.
#3  unpack_object () at pack_unpack.c:551
No locals.
#4  unpack () at pack_unpack.c:686
No locals.
#5  0x00007fa5920ceb48 in json_vunpack_ex () at pack_unpack.c:915
No locals.
#6  0x00007fa5920cf980 in json_unpack () at pack_unpack.c:948
No locals.
#7  0x00007fa592185eef in __json_parse () from /usr/lib64/libnftables.so.1.0.0
No symbol table info available.
#8  0x00007fa59217c010 in nft_run_cmd_from_buffer () from /usr/lib64/libnftables.so.1.0.0
No symbol table info available.
#9  0x00007fa5930af13d in ?? () from /usr/lib64/libffi.so.7
No symbol table info available.
#10 0x00007fa5930b3f02 in ?? () from /usr/lib64/libffi.so.7
No symbol table info available.
#11 0x00007fa592953faa in _call_function_pointer (flags=4353, pProc=0x7fa59217bef5 <nft_run_cmd_from_buffer>, avalues=0x7fff27d4a590, atypes=<optimized out>, restype=<optimized out>, 
    resmem=0x7fff27d4a5a0, argcount=2) at Modules/_ctypes/callproc.c:871
        _save = 0x592bb0
        error_object = 0x0
        cc = 2
        cif = {abi = FFI_UNIX64, nargs = 2, arg_types = 0x7fff27d4a580, rtype = 0x7fa592964f28, bytes = 0, flags = 6}
        space = <optimized out>
        temp = <optimized out>
        temp = <optimized out>
#12 _ctypes_callproc (pProc=0x7fa59217bef5 <nft_run_cmd_from_buffer>, argtuple=0x0, flags=4353, argtypes=<optimized out>, restype=0x7d2fa0, checker=<optimized out>)
    at Modules/_ctypes/callproc.c:1199
        retval = 0x0
        n = 2
        argcount = 2
        args = 0x7fff27d4a5b0
        argtype_count = 5843888
        pa = <optimized out>
        i = <optimized out>
        rtype = <optimized out>
        resbuf = 0x7fff27d4a5a0
        avalues = 0x7fff27d4a590
        atypes = <optimized out>
#13 0x00007fa59294ed71 in PyCFuncPtr_call (self=0x7fa59227cf40, inargs=0x7fa591fd3e00, kwds=0x0) at Modules/_ctypes/_ctypes.c:4181
        dict = <optimized out>
        pProc = 0x0
        restype = <optimized out>
        converters = <optimized out>
        checker = <optimized out>
        argtypes = <optimized out>
        errcheck = 0x0
        numretvals = 0
        inoutmask = 0
        outmask = 0
        callargs = 0x7fa591fd3e00
        result = <optimized out>
#14 0x00007fa596c649f7 in _PyObject_MakeTpCall () at Objects/call.c:159
No locals.
#15 0x00007fa596d89924 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:125
No locals.
#16 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#17 function_code_fastcall () at Objects/call.c:283
No locals.
#18 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#19 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#20 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#21 function_code_fastcall () at Objects/call.c:283
No locals.
#22 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#23 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#24 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#25 function_code_fastcall () at Objects/call.c:283
No locals.
#26 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#27 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#28 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#29 function_code_fastcall () at Objects/call.c:283
No locals.
#30 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#31 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#32 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#33 function_code_fastcall () at Objects/call.c:283
No locals.
#34 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#35 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#36 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#37 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#38 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#39 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#40 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#41 function_code_fastcall () at Objects/call.c:283
No locals.
#42 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#43 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#44 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#45 function_code_fastcall () at Objects/call.c:283
No locals.
#46 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#47 0x00007fa596c65478 in PyVectorcall_Call () at Objects/call.c:199
No locals.
#48 0x00007fa596d89e7a in _PyEval_EvalFrameDefault () at Python/ceval.c:5034
No locals.
#49 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#50 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#51 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#52 0x00007fa596d89a17 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#53 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#54 function_code_fastcall () at Objects/call.c:283
No locals.
#55 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#56 0x00007fa596d896f6 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#57 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#58 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#59 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#60 0x00007fa596c65478 in PyVectorcall_Call () at Objects/call.c:199
No locals.
#61 0x00007fa596d89e7a in _PyEval_EvalFrameDefault () at Python/ceval.c:5034
No locals.
#62 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#63 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#64 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#65 0x00007fa596c65478 in PyVectorcall_Call () at Objects/call.c:199
No locals.
#66 0x00007fa596d89e7a in _PyEval_EvalFrameDefault () at Python/ceval.c:5034
No locals.
#67 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#68 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#69 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#70 0x00007fa596c646d6 in _PyObject_FastCallDict () at Objects/call.c:96
No locals.
#71 0x00007fa596ce52fd in slot_tp_init () at Objects/call.c:887
No locals.
#72 0x00007fa596ceed24 in type_call () at Objects/typeobject.c:991
No locals.
#73 0x00007fa596c649f7 in _PyObject_MakeTpCall () at Objects/call.c:159
No locals.
#74 0x00007fa596d89b05 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:125
No locals.
#75 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#76 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#77 0x00007fa596c6617f in _PyFunction_Vectorcall () at Objects/call.c:435
No locals.
#78 0x00007fa596d898af in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#79 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#80 function_code_fastcall () at Objects/call.c:283
No locals.
#81 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#82 0x00007fa596d89a17 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#83 0x00007fa596c66216 in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#84 function_code_fastcall () at Objects/call.c:283
No locals.
#85 _PyFunction_Vectorcall () at Objects/call.c:410
No locals.
#86 0x00007fa596d89a17 in _PyEval_EvalFrameDefault () at ./Include/cpython/abstract.h:127
No locals.
#87 0x00007fa596d7feab in PyEval_EvalFrameEx () at Python/ceval.c:741
No locals.
#88 _PyEval_EvalCodeWithName () at Python/ceval.c:4298
No locals.
#89 0x00007fa596de7770 in PyEval_EvalCodeEx () at Python/ceval.c:4327
No locals.
#90 PyEval_EvalCode () at Python/ceval.c:718
No locals.
#91 run_eval_code_obj () at Python/pythonrun.c:1125
No locals.
#92 0x00007fa596de76bd in run_mod () at Python/pythonrun.c:1147
No locals.
#93 0x00007fa596de52a7 in PyRun_FileExFlags () at Python/pythonrun.c:1063
No locals.
#94 0x00007fa596de4ae7 in PyRun_SimpleFileExFlags () at Python/pythonrun.c:428
No locals.
#95 0x00007fa596e0e0bc in pymain_run_file () at Modules/main.c:381
No locals.
#96 0x00007fa596e0d606 in pymain_run_python () at Modules/main.c:565
No locals.
#97 0x00007fa596e0d404 in Py_RunMain () at Modules/main.c:644
No locals.
#98 0x00007fa596e0e9fc in pymain_main () at Modules/main.c:674
No locals.
#99 0x00007fa596e0ed17 in Py_BytesMain () at Modules/main.c:698
No locals.
#100 0x00007fa596927e6b in __libc_start_main (main=0x201150 <main>, argc=4, argv=0x7fff27d4ce38, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fff27d4ce28) at ../csu/libc-start.c:308
        self = <optimized out>
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -5808915208034097174, 2101248, 140733861645872, 0, 0, 5808756817232357354, 5776353614130145258}, mask_was_saved = 0}}, priv = {
            pad = {0x0, 0x0, 0x7fff27d4ce60, 0x7fa596f44150}, data = {prev = 0x0, cleanup = 0x0, canceltype = 668257888}}}
        not_first_call = <optimized out>
#101 0x000000000020102a in _start () at ../sysdeps/x86_64/start.S:120
No locals.
(gdb) Quit


@coreyfarrell
Collaborator

Could you install the debuginfo package for libnftables? This might provide more details. I'm not sure this is a bug in jansson; it could be that libnftables (or some other part of the process) has a buffer overrun and is corrupting jansson data structures. For reference, see https://coveralls.io/github/akheron/jansson: the coverage for jansson is very high, and we even have "chaos" tests which perform brute-force malloc failure testing. Also, can you confirm you are running jansson v2.12?

@zeha

zeha commented Jun 28, 2020

IIRC OpenMandriva enabled verity/cryptsetup support in util-linux/libmount a while ago. Can you check if this problem goes away with verity/cryptsetup disabled?

@mbiebl

mbiebl commented Jun 28, 2020

FWIW, @zeha is referring to this downstream bug report in Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963721#59 where we ran into "weird" issues once libmount was compiled with cryptsetup/verity support.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963721#66 has a backtrace. Since I can easily reproduce the issue, I can run further diagnostics.

@zeha

zeha commented Jun 28, 2020

It turns out that this particular crash is caused by both jansson and json-c exporting a symbol named json_object_iter_next, with incompatible implementations.

In Debian, this is now tracked as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963932 (a clone of 963721, which was actually unrelated).
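
A clash like this can be checked for by comparing the dynamic symbol tables of the two libraries. The following is only a rough sketch: it assumes the input text has the usual three-column shape printed by `nm -D --defined-only <library>`, the function names are made up for illustration, and the sample listings are hypothetical (invented addresses, tiny symbol sets), not real output from jansson or json-c.

```python
def exported_names(nm_output):
    """Collect defined symbol names from `nm -D --defined-only` style text."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Expected shape: "<address> <type> <name>"; skip anything else,
        # such as undefined symbols, which have no address column.
        if len(fields) != 3:
            continue
        _address, sym_type, name = fields
        # T = code, D/B/R = data, W/V = weak definitions.
        if sym_type.upper() not in {"T", "D", "B", "R", "W", "V"}:
            continue
        # Drop any "@VERSION" or "@@VERSION" suffix so bare names compare equal.
        names.add(name.split("@", 1)[0])
    return names


def clashing_symbols(nm_a, nm_b):
    """Return the sorted list of names that both listings define."""
    return sorted(exported_names(nm_a) & exported_names(nm_b))


# Hypothetical listings: addresses and symbol selection are invented.
SAMPLE_JANSSON = """\
0000000000006ac0 T json_object_get
0000000000006f10 T json_object_iter_next
0000000000009b00 T json_vunpack_ex
"""
SAMPLE_JSON_C = """\
0000000000004a20 T json_object_get
0000000000004c80 T json_object_iter_next
0000000000005100 T json_tokener_parse
"""

print(clashing_symbols(SAMPLE_JANSSON, SAMPLE_JSON_C))
# prints ['json_object_get', 'json_object_iter_next']
```

Any name in the result is one that a process loading both libraries could resolve to the wrong implementation.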

@zeha

zeha commented Jun 28, 2020

Also see json-c/json-c#621

@smcv
Contributor

smcv commented Jun 29, 2020

The possible ways to resolve this conflict go something like this:

  • json-c renames its symbols to something like jsonc_whatever(), which is an API and ABI break, and presumably the json-c maintainers would prefer not to do it
  • jansson renames its symbols to something like jansson_whatever(), which is an API and ABI break, and presumably the jansson maintainers would prefer not to do it
  • both libraries rename their symbols, as above
  • both libraries use versioned symbols, at least on Linux, and then distributions rebuild dependent binaries against the release that has versioned symbols

Using versioned symbols would mean that:

  • when dependent binaries that were compiled against a new enough shared json-c call json_object_get(), it's really a reference to json_object_get@JSON_C or similar;
  • when dependent binaries that were compiled against a new enough shared jansson call json_object_get(), it's really a reference to json_object_get@JANSSON or similar;
  • at runtime, ld.so chooses the right implementation of json_object_get in each case;
  • dependent binaries that have not yet been recompiled against a new enough json-c or jansson still have a reference to plain json_object_get, which can be satisfied by either of the libraries, so they will continue to get the wrong one and crash about half the time
  • dependent binaries that were statically linked to either json-c or jansson still have a reference to plain json_object_get, and will potentially still crash unless they are linked with -Wl,-Bsymbolic
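
To make the cases above concrete, here is a deliberately simplified model of the lookup, not a description of how any real ld.so is implemented. The `resolve` helper and the version names `JSON_C` and `JANSSON` are assumptions for illustration, and the model assumes both libraries define versions; real dynamic linkers have extra rules (for example for libraries with no version information at all) that it ignores.

```python
def resolve(search_order, symbol, version=None):
    """Return the name of the library that a reference binds to, or None.

    search_order: list of (library_name, {symbol: version}) pairs, in the
                  order the dynamic linker would search them.
    version:      the version recorded in the reference, or None for a
                  binary built before the libraries had versioned symbols.
    """
    for library, exports in search_order:
        if symbol not in exports:
            continue
        # An unversioned reference takes the first definition it meets;
        # a versioned reference only accepts a matching version.
        if version is None or exports[symbol] == version:
            return library
    return None


# Hypothetical version names, following the "@JSON_C" / "@JANSSON" examples.
json_c = ("json-c", {"json_object_get": "JSON_C"})
jansson = ("jansson", {"json_object_get": "JANSSON"})
search_order = [json_c, jansson]  # json-c happens to be searched first

# Rebuilt jansson user: the reference carries a version, so order is irrelevant.
print(resolve(search_order, "json_object_get", "JANSSON"))  # prints jansson
# Old jansson user: the plain reference binds to whichever library comes first.
print(resolve(search_order, "json_object_get"))  # prints json-c
```

In this model the second call is the crashing case: a binary that wanted jansson's function but gets json-c's because of search order.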

None of these solutions are perfect: renaming symbols is an ABI break, while adding versioned symbols requires action from the maintainers of both libraries.

If the upstream maintainers of these libraries don't take action, it is possible that downstream maintainers in Linux distributions will decide to add versioned symbols without coordination with upstream. This is usually a bad idea and leads to binary-compatibility problems in the future, and I would recommend that downstream maintainers not do that, but they might consider it a necessary evil to fix crashes.

@mbiebl

mbiebl commented Jun 29, 2020

Apparently, json-glib also uses the json_ namespace and has conflicting symbols. See https://gitlab.gnome.org/GNOME/json-glib/-/issues/33

@smcv
Contributor

smcv commented Jun 29, 2020

The easiest way to have versioned symbols is: on supported platforms (at least GNU/Linux) and with supported compilers (at least gcc and clang), link the shared library with -Wl,--default-symver. This decorates every symbol with a symbol-version that is identical to the SONAME. For example, json_object_get becomes json_object_get@libjansson.so.4.

This simple form of versioned symbols is more or less equivalent to -Wl,--version-script=jansson.ver where jansson.ver has these contents:

libjansson.so.4 {
  global:
    *;
};

Either of those is enough to make newly-compiled libjansson-dependent objects find the libjansson version of a function in preference to other versions. It will not prevent json-glib-dependent or json-c-dependent objects from crashing when they find libjansson's version of a symbol instead of the one they wanted: to prevent that, json-glib and json-c would need to adopt versioned symbols as well.
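
As a small illustration of that equivalence, the version script and the decorated symbol name can both be derived from the SONAME alone. This sketch only formats text and does not run the linker; the helper names `default_version_script` and `versioned_name` are hypothetical.

```python
def default_version_script(soname):
    """Build a catch-all version script that uses the SONAME as the version."""
    return "%s {\n  global:\n    *;\n};\n" % soname


def versioned_name(symbol, soname):
    """Show how a symbol reference looks once it carries that version."""
    return "%s@%s" % (symbol, soname)


print(default_version_script("libjansson.so.4"), end="")
print(versioned_name("json_object_get", "libjansson.so.4"))
# last line printed: json_object_get@libjansson.so.4
```

The generated text could be written to jansson.ver and passed to the linker with -Wl,--version-script=jansson.ver, as described above.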

Adding versioned symbols is a compatible change and does not need a SONAME bump, but if you adopt versioned symbols, they will become part of the library's ABI: removing or changing them is an incompatible change which requires a SONAME bump.

It is also possible to use more complicated version scripts that assign versions to individual symbols, as is done in projects like OpenSSL, util-linux/libmount/libblkid and libgcab. This requires maintaining or generating a detailed list of symbols, and can help downstream projects to track what the library's dependencies are, but is not necessary if all you want to achieve is to resolve the symbol clash between json-c, json-glib and libjansson.

smcv added a commit to smcv/jansson that referenced this issue Jun 29, 2020
The --default-symver linker option attaches a default version definition
(the SONAME) to every exported symbol. It is supported since at least
GNU binutils 2.22 in 2011 (older versions not tested).

With this version definition, newly-linked binaries that depend on the
jansson shared library will refer to its symbols in a versioned form,
preventing their references from being resolved to a symbol of the same
name exported by json-c or json-glib if those libraries appear in
dependency search order before jansson, which will usually result in
a crash. This is necessary because ELF symbol resolution normally uses
a single flat namespace, not a tree like Windows symbol resolution.
At least one symbol (json_object_iter_next()) is exported by all three
JSON libraries.

Linking with -Bsymbolic is not enough to have this effect in all cases,
because -Bsymbolic only affects symbol lookup within a shared object,
for example when parse_json() calls json_decref(). It does not affect
calls from external code into jansson, unless jansson was statically
linked into the external caller.

This change will also not prevent code that depends on json-c or
json-glib from finding jansson's symbols and crashing; to prevent
that, a corresponding change in json-c or json-glib would be needed.

Adding a symbol-version is a backwards-compatible change, but once
added, removing or changing the symbol-version would be an incompatible
change that requires a SONAME bump.

Resolves: akheron#523
(when combined with an equivalent change to json-c).

Signed-off-by: Simon McVittie <smcv@collabora.com>
@akheron
Owner

akheron commented Jun 29, 2020

Supporting versioned symbols sounds good to me. Problems with clashing symbol names between JSON libraries are reported every now and then, and this would seem to help with those problems.

smcv added a commit to smcv/jansson that referenced this issue Jul 2, 2020
JingweiLiuis added a commit to JingweiLiuis/jansson that referenced this issue Nov 14, 2022