Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault when using synchronous redis requests #2789

Closed
membero opened this Issue Mar 13, 2019 · 2 comments

Comments

Projects
None yet
3 participants
@membero
Copy link

membero commented Mar 13, 2019

Classification (Please choose one option):

  • Crash/Hang/Data loss
  • WebUI/Usability
  • Unintended behaviour (bug)
  • Enhancement

Reproducibility (Please choose one option):

  • Always
  • Sometimes
  • Rarely
  • Unable
  • I didn’t try
  • Not applicable

Rspamd version:

1.9.0 and 1.9.1

I did install them from your repository. Actually I tried rspamd_1.9.1-0git1f4268b3~stretch_amd64

Operation system (the exact version), CPU (including architecture and processor family):

Debian Stretch

Description (Please provide a descriptive summary of the issue):

I use some own modules which load settings from redis. This is done with synchronous requests. I used documentation from https://rspamd.com/doc/lua/sync_async.html

Segfault occurs when doing connection:exec().

This happens also if I just add the example plugin from https://rspamd.com/doc/lua/sync_async.html (last example from page).

Compile errors (if any):

Relevant logs (see details here):

2019-03-13 11:49:42 #31627(normal) <45f89b>; lua; test.lua:34: connect: true, rspamd{redis}(000000004036EC48)
2019-03-13 11:49:42 #31627(normal) <45f89b>; lua; test.lua:46: add_cmd: true, nil
2019-03-13 11:49:42 #31627(normal) rspamd_crash_sig_handler: caught fatal signal 11(Segmentation fault), pid: 31627, trace:
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 0: 00007F5516E8E0DA: lua_thread_yield_full()+0xa
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 1: 00007F55160F9106: _init()+0x25f6
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 2: 00007F551613E9E0: lua_pcall()+0x30
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 3: 00007F5516F193F7: lua_metric_symbol_callback()+0xb7
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 4: 00007F5516E3C489: rspamd_symcache_check_symbol.isra.4.lto_priv.181()+0
x209
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 5: 00007F5516E3D6C8: rspamd_symcache_process_symbols()+0x658
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 6: 00007F5516E3DC0B: rspamd_task_process()+0x41b
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 7: 00007F5516E3DA41: rspamd_task_process()+0x251
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 8: 00007F5516E3DA41: rspamd_task_process()+0x251
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 9: 00007F5516E3DD74: rspamd_task_fin()+0x14
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 10: 00007F5516EAEE05: rspamd_session_pending()+0x55
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 11: 00007F5516EB78E2: rspamd_session_remove_event_full()+0x222
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 12: 00007F5516E5C2AC: rdns_process_read()+0x3ec
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 13: 00007F5516B105A0: event_base_loop()+0x6a0
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 14: 000055A6E8D32387: start_worker()+0x117
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 15: 00007F5516EC8170: rspamd_fork_worker()+0x2b0
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 16: 000055A6E8D2A6E3: spawn_worker_type.lto_priv.165()+0xa3
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 17: 000055A6E8D2E3AE: spawn_workers()+0xae
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 18: 000055A6E8D0D45A: main()+0x9fa
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 19: 00007F55179E92E1: __libc_start_main()+0xf1
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 20: 000055A6E8D0D9AA: _start()+0x2a
2019-03-13 11:49:42 #31627(normal) rspamd_print_crash: 21: 0000000000000000: <unknown>

Expected results:

No segfaults.

Actual results:

Segfaults occur whenever an email arrives.

Debugging information (see details here):

GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/rspamd...done.
[New LWP 31626]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `rspamd: normal process (localhost:11333)     '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  lua_thread_yield_full (thread_entry=<optimized out>, nresults=<optimized out>, loc=<optimized out>)
    at ./src/lua/lua_thread_pool.c:336
336     ./src/lua/lua_thread_pool.c: No such file or directory.
(gdb) bt full
#0  lua_thread_yield_full (thread_entry=<optimized out>, nresults=<optimized out>, loc=<optimized out>) at ./src/lua/lua_thread_pool.c:336
        _g_boolean_var_ = <optimized out>
#1  0x00007f55160f9106 in ?? () from /usr/lib/x86_64-linux-gnu/libluajit-5.1.so.2
No symbol table info available.
#2  0x00007f551613e9e0 in lua_pcall () from /usr/lib/x86_64-linux-gnu/libluajit-5.1.so.2
No symbol table info available.
#3  0x00007f5516f193f7 in lua_metric_symbol_callback (task=0x55a6eae32cd0, item=<optimized out>, ud=0x7f5510fdc5f8) at ./src/lua/lua_config.c:1146
        cd = 0x7f5510fdc5f8
        ptask = 0x40e20828
        level = 1
        nresults = <optimized out>
        err_idx = 1
        ret = <optimized out>
        L = 0x407f3378
        tb = <optimized out>
        s = <optimized out>
        __func__ = "lua_metric_symbol_callback"
#4  0x00007f5516e3c489 in rspamd_symcache_check_symbol.isra.4.lto_priv.181 (task=task@entry=0x55a6eae32cd0, item=item@entry=0x55a6ea9d3540, checkpoint=checkpoint@entry=0x55a6eb4c2458) at ./src/libserver/rspamd_symcache.c:1434
        t1 = <optimized out>
        ptask = <optimized out>
        L = <optimized out>
        check = <optimized out>
#5  0x00007f5516e3d6c8 in rspamd_symcache_process_symbols (task=task@entry=0x55a6eae32cd0, cache=0x55a6e9c9c730, stage=stage@entry=32) at ./src/libserver/rspamd_symcache.c:1803
        item = 0x55a6ea9d3540
        checkpoint = 0x55a6eb4c2458
        i = 135
        all_done = 0
        saved_priority = <optimized out>
        start_events_pending = <optimized out>
        __func__ = "rspamd_symcache_process_symbols"
#6  0x00007f5516e3dc0b in rspamd_task_process (task=task@entry=0x55a6eae32cd0, stages=stages@entry=131071) at ./src/libserver/task.c:778
        ret = 1
        stat_error = 0x0
        __func__ = "rspamd_task_process"
#7  0x00007f5516e3da41 in rspamd_task_process (task=task@entry=0x55a6eae32cd0, stages=stages@entry=131071) at ./src/libserver/task.c:898
        ret = <optimized out>
        stat_error = 0x0
        __func__ = "rspamd_task_process"
#8  0x00007f5516e3da41 in rspamd_task_process (task=task@entry=0x55a6eae32cd0, stages=stages@entry=131071) at ./src/libserver/task.c:898
        ret = <optimized out>
        stat_error = 0x0
        __func__ = "rspamd_task_process"
#9  0x00007f5516e3dd74 in rspamd_task_fin (arg=<optimized out>, arg=<optimized out>) at ./src/libserver/task.c:203
        task = 0x55a6eae32cd0
#10 0x00007f5516eaee05 in rspamd_session_pending (session=<optimized out>, session=<optimized out>) at ./src/libserver/events.c:305
        __func__ = "rspamd_session_pending"
#11 0x00007f5516eb78e2 in rspamd_session_remove_event_full (session=<optimized out>, fin=<optimized out>, ud=<optimized out>, loc=<optimized out>, loc=<optimized out>, ud=<optimized out>, fin=<optimized out>, session=<optimized out>)
    at ./src/libserver/events.c:245
        search_ev = {subsystem = 0x1 <error: Cannot access memory at address 0x1>, loc = 0x55a6eb4d14d0 "P\032M\353\246U", fin = 0x7f5516e73000 <rspamd_dns_fin_cb>, user_data = 0x55a6eb4c2c60}
        found_ev = <optimized out>
        loc = 0x7f55176482c8 "/rspamd-1.9.1/src/libserver/dns.c:108"
        ud = 0x55a6eb4c2c60
        fin = 0x7f5516e73000 <rspamd_dns_fin_cb>
        session = 0x55a6eb4c10d0
        __func__ = "rspamd_session_remove_event_full"
#12 0x00007f5516e5c2ac in rdns_process_read (fd=<optimized out>, arg=<optimized out>) at ./contrib/librdns/resolver.c:281
        ioc = <optimized out>
        resolver = <optimized out>
        req = 0x55a6eb4d1570
        r = <optimized out>
        rep = 0x55a6eb4d1a20
        in = "XXX"...
---Type <return> to continue, or q <return> to quit---
#13 0x00007f5516b105a0 in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
No symbol table info available.
#14 0x000055a6e8d32387 in start_worker (worker=0x55a6eaeb4a10) at ./src/worker.c:702
        ctx = 0x7f55108052e8
        http_ctx = <optimized out>
#15 0x00007f5516ec8170 in rspamd_fork_worker (rspamd_main=0x55a6e9c98580, cf=0x55a6e9d44ed0, index=0, ev_base=<optimized out>) at ./src/libserver/worker_util.c:685
        wrk = 0x55a6eaeb4a10
        rc = <optimized out>
        rlim = {rlim_cur = 104857600, rlim_max = 104857600}
        __func__ = "rspamd_fork_worker"
#16 0x000055a6e8d2a6e3 in spawn_worker_type.lto_priv.165 (rspamd_main=0x55a6e9c98580, ev_base=0x55a6eaeb39e0, cf=0x55a6e9d44ed0) at ./src/rspamd.c:543
        i = 0
        __func__ = "spawn_worker_type"
#17 0x000055a6e8d2e3ae in spawn_workers (rspamd_main=0x55a6e9c98580, ev_base=0x55a6eaeb39e0) at ./src/rspamd.c:636
        cur = <optimized out>
        ls = <optimized out>
        cf = <optimized out>
        p = <optimized out>
        bcf = <optimized out>
        listen_ok = <optimized out>
        seen_mandatory_workers = 0x55a6eae5bfa0
        cw = <optimized out>
        wrk = <optimized out>
        i = <optimized out>
        __func__ = "spawn_workers"
#18 0x000055a6e8d0d45a in main (argc=<optimized out>, argv=<optimized out>, env=<optimized out>) at ./src/rspamd.c:1489
        i = <optimized out>
        res = 0
        signals = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
        sigpipe_act = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {4096, 0 <repeats 15 times>}}, sa_flags = 0, sa_restorer = 0x19}
        pworker = <optimized out>
        type = <optimized out>
        control_addr = 0x55a6eae0ddd0
        ev_base = 0x55a6eaeb39e0
        term_ev = {ev_active_next = {tqe_next = 0x0, tqe_prev = 0x0}, ev_next = {tqe_next = 0x7ffebdb588f0, tqe_prev = 0x55a6eaeb3b40}, ev_timeout_pos = {ev_next_with_common_timeout = {tqe_next = 0xffffffff, tqe_prev = 0x0},
            min_heap_idx = -1}, ev_fd = 15, ev_base = 0x55a6eaeb39e0, _ev = {ev_io = {ev_io_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4310}, ev_timeout = {tv_sec = 723401728380764160, tv_usec = 0}}, ev_signal = {ev_signal_next = {
                tqe_next = 0x0, tqe_prev = 0x55a6eaeb4310}, ev_ncalls = 0, ev_pncalls = 0x0}}, ev_events = 24, ev_res = 0, ev_flags = 128, ev_pri = 0 '\000', ev_closure = 1 '\001', ev_timeout = {tv_sec = 0, tv_usec = 0},
          ev_callback = 0x55a6e8d2d5c0 <rspamd_term_handler>, ev_arg = 0x55a6e9c98580}
        int_ev = {ev_active_next = {tqe_next = 0x7ffebdb58c90, tqe_prev = 0x7ffebdb58c68}, ev_next = {tqe_next = 0x7ffebdb587d0, tqe_prev = 0x55a6eaeb3b40}, ev_timeout_pos = {ev_next_with_common_timeout = {tqe_next = 0x7f55ffffffff,
              tqe_prev = 0x0}, min_heap_idx = -1}, ev_fd = 2, ev_base = 0x55a6eaeb39e0, _ev = {ev_io = {ev_io_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4470}, ev_timeout = {tv_sec = 0, tv_usec = 0}}, ev_signal = {ev_signal_next = {
                tqe_next = 0x0, tqe_prev = 0x55a6eaeb4470}, ev_ncalls = 0, ev_pncalls = 0x0}}, ev_events = 24, ev_res = 0, ev_flags = 128, ev_pri = 0 '\000', ev_closure = 1 '\001', ev_timeout = {tv_sec = 0, tv_usec = 0},
          ev_callback = 0x55a6e8d2d5c0 <rspamd_term_handler>, ev_arg = 0x55a6e9c98580}
        cld_ev = {ev_active_next = {tqe_next = 0x7f5517f824e8, tqe_prev = 0x7ffebdb58880}, ev_next = {tqe_next = 0x7ffebdb58740, tqe_prev = 0x55a6eaeb3b40}, ev_timeout_pos = {ev_next_with_common_timeout = {tqe_next = 0x55a6ffffffff,
              tqe_prev = 0x7ffebdb58c90}, min_heap_idx = -1}, ev_fd = 17, ev_base = 0x55a6eaeb39e0, _ev = {ev_io = {ev_io_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4250}, ev_timeout = {tv_sec = 0, tv_usec = 0}}, ev_signal = {
              ev_signal_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4250}, ev_ncalls = 0, ev_pncalls = 0x0}}, ev_events = 24, ev_res = 0, ev_flags = 128, ev_pri = 0 '\000', ev_closure = 1 '\001', ev_timeout = {
            tv_sec = 140003432031224, tv_usec = 140732081212192}, ev_callback = 0x55a6e8d2ded0 <rspamd_cld_handler>, ev_arg = 0x55a6e9c98580}
        hup_ev = {ev_active_next = {tqe_next = 0x7ffebdb58810, tqe_prev = 0x117f7afd8}, ev_next = {tqe_next = 0x7ffebdb58860, tqe_prev = 0x55a6eaeb3b40}, ev_timeout_pos = {ev_next_with_common_timeout = {tqe_next = 0x55a6ffffffff,
              tqe_prev = 0xb9956ede}, min_heap_idx = -1}, ev_fd = 1, ev_base = 0x55a6eaeb39e0, _ev = {ev_io = {ev_io_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4190}, ev_timeout = {tv_sec = 0, tv_usec = 0}}, ev_signal = {
              ev_signal_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4190}, ev_ncalls = 0, ev_pncalls = 0x0}}, ev_events = 24, ev_res = 0, ev_flags = 128, ev_pri = 0 '\000', ev_closure = 1 '\001', ev_timeout = {tv_sec = 226670640,
            tv_usec = 0}, ev_callback = 0x55a6e8d2e9e0 <rspamd_hup_handler>, ev_arg = 0x55a6e9c98580}
        usr1_ev = {ev_active_next = {tqe_next = 0x7f5516d6b321, tqe_prev = 0x7f5517f8d4c8}, ev_next = {tqe_next = 0x55a6eaeb4688, tqe_prev = 0x55a6eaeb3b40}, ev_timeout_pos = {ev_next_with_common_timeout = {tqe_next = 0xffffffff,
              tqe_prev = 0x0}, min_heap_idx = -1}, ev_fd = 10, ev_base = 0x55a6eaeb39e0, _ev = {ev_io = {ev_io_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4950}, ev_timeout = {tv_sec = 0, tv_usec = 0}}, ev_signal = {
              ev_signal_next = {tqe_next = 0x0, tqe_prev = 0x55a6eaeb4950}, ev_ncalls = 0, ev_pncalls = 0x0}}, ev_events = 24, ev_res = 0, ev_flags = 128, ev_pri = 0 '\000', ev_closure = 1 '\001', ev_timeout = {
            tv_sec = 140003451130224, tv_usec = 3182790920}, ev_callback = 0x55a6e8d2d620 <rspamd_usr1_handler>, ev_arg = 0x55a6e9c98580}
        control_ev = {ev_active_next = {tqe_next = 0x7ffebdb58734, tqe_prev = 0x7f55179ccd90}, ev_next = {tqe_next = 0x55a6e8d013a0, tqe_prev = 0x55a6e8cfabd0}, ev_timeout_pos = {ev_next_with_common_timeout = {
              tqe_next = 0x7ffebdb58738, tqe_prev = 0xb9956ede}, min_heap_idx = -1112176840}, ev_fd = 48649659, ev_base = 0x7f5516d62638, _ev = {ev_io = {ev_io_next = {tqe_next = 0x7ffebdb58810, tqe_prev = 0x7f5516d4e3a0},
              ev_timeout = {tv_sec = 140732081211188, tv_usec = 140732081211392}}, ev_signal = {ev_signal_next = {tqe_next = 0x7ffebdb58810, tqe_prev = 0x7f5516d4e3a0}, ev_ncalls = -30924, ev_pncalls = 0x7ffebdb58800}},
          ev_events = -22312, ev_res = 6135, ev_flags = 32597, ev_pri = 0 '\000', ev_closure = 0 '\000', ev_timeout = {tv_sec = 26, tv_usec = 402137088}, ev_callback = 0x7ffebdb587b8, ev_arg = 0xbdb587b4}
        term_tv = {tv_sec = 140003451086056, tv_usec = 140732081211192}
        skip_pid = 1
        valgrind_mode = 0
        __func__ = "main"

Configuration (e.g. rspamadm configdump module):

Additional information:

When I deactivate the module, everything is fine.

@rm-minus-rf

This comment has been minimized.

Copy link

rm-minus-rf commented Mar 18, 2019

We had the same problem while upgrading from 1.8.3 to 1.9.0.
We got rid of the segfaults by adding the 'coro' flag when registering the symbol:

rspamd_config:register_symbol({
  name = 'REDIS_TEST',
  score = 1.0,
  callback = redis_symbol,
  no_squeeze = true,
  flags = 'coro'
})

I don't know if the 'coro' flag is documented somewhere, i just saw it in a commit and tried it...

@vstakhov

This comment has been minimized.

Copy link
Member

vstakhov commented Mar 18, 2019

Thank you for your report, I have updated the documentation and the Upgrading page.

@vstakhov vstakhov closed this Mar 18, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.