Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] rspamd 3.4 triggering unaligned memory access in hyperscan #4329

Closed
darix opened this issue Nov 8, 2022 · 29 comments
Closed

[BUG] rspamd 3.4 triggering unaligned memory access in hyperscan #4329

darix opened this issue Nov 8, 2022 · 29 comments
Labels

Comments

@darix
Copy link

darix commented Nov 8, 2022

#0  _mm256_shuffle_epi8 (__Y=..., __X=...) at /usr/lib64/gcc/x86_64-suse-linux/12/include/avx2intrin.h:789
#1  pshufb_m256 (b=..., a=...) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/util/simd_utils.h:340
#2  prep_conf_fat_teddy_m1 (val=..., maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:512
#3  prep_conf_fat_teddy_m2 (val=..., old_1=<optimized out>, maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:521
#4  prep_conf_fat_teddy_m3 (val=..., old_2=<synthetic pointer>, old_1=<synthetic pointer>, maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:536
#5  fdr_exec_fat_teddy_msks3 (fdr=<optimized out>, a=0x7ffe37d4bb30, control=71) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:688
#6  0x00007f9ba7734ab4 in fdrExec (fdr=<optimized out>, buf=<optimized out>, len=<optimized out>, start=<optimized out>, cb=<optimized out>, scratch=<optimized out>, groups=71) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/fdr.c:849
#7  0x00007f9ba77ea0ca in roseBlockFloating (scratch=0x5571406b5080, t=0x7f9ba2556030) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/rose/block.c:259
#8  roseBlockExec (t=<optimized out>, scratch=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/rose/block.c:398
#9  0x00007f9ba7726f98 in rawBlockExec (scratch=0x5571406b5080, rose=0x7f9ba2556030) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/runtime.c:188
#10 hs_scan (db=<optimized out>, data=<optimized out>, length=173, flags=<optimized out>, scratch=0x5571406b5080, onEvent=<optimized out>, userCtx=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/runtime.c:419
#11 0x00007f9ba8136d00 in rspamd_re_cache_process_regexp_data (rt=rt@entry=0x5571408fef40, re=re@entry=0x55713fd4e7d0, task=task@entry=0x5571408fdf30, in=in@entry=0x5571408f3280, lens=lens@entry=0x5571408ff0f0, count=count@entry=7, is_raw=0, processed_hyperscan=0x7ffe37d4bec8)
    at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:813
#12 0x00007f9ba8136f0b in rspamd_re_cache_process_headers_list (task=0x5571408fdf30, rt=0x5571408fef40, re=0x55713fd4e7d0, re_class=0x55713fd4e980, rh=<optimized out>, is_strong=0, processed_hyperscan=0x7ffe37d4bec8) at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:1092
[snip: i can provide the full stacktrace if needed]

Steps to Reproduce

  1. upgrade rspamd to 3.4
  2. observe crashes

Expected behavior

do not crash.

Versions

OS: openSUSE Tumbleweed
Package Build description: https://build.opensuse.org/package/show/server:mail/rspamd

Additional Information

  • Downgrading to 3.3 without changing anything else fixes the issue.
  • we looked into the code and it seems one of the data entries is 16 bit aligned and should be 32bit aligned.
@darix darix added the bug label Nov 8, 2022
@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

  • we looked into the code and it seems one of the data entries is 16 bit aligned and should be 32bit aligned.

Could you please elaborate more on that?

@darix
Copy link
Author

darix commented Nov 8, 2022

we ran disassemble on the code to see which params were used for the crashing move operation and one of them was only 16bit aligned. but i lost my backtraces and need to reproduce it again.

upgraded the package again to get new coredumps.

@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

I mean which pointer is not properly aligned? Hypescan database? I see hs_scan (db=<optimized out> but there is no address. The address should be in the previous frame: p *(rspamd::util::hs_shared_database *)re_class->hs_db)

@darix
Copy link
Author

darix commented Nov 8, 2022

bt
#0  _mm256_shuffle_epi8 (__Y=..., __X=...) at /usr/lib64/gcc/x86_64-suse-linux/12/include/avx2intrin.h:789
#1  pshufb_m256 (b=..., a=...) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/util/simd_utils.h:340
#2  prep_conf_fat_teddy_m1 (val=..., maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:512
#3  prep_conf_fat_teddy_m2 (val=..., old_1=<optimized out>, maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:521
#4  prep_conf_fat_teddy_m3 (val=..., old_2=<synthetic pointer>, old_1=<synthetic pointer>, maskBase=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:536
#5  fdr_exec_fat_teddy_msks3 (fdr=<optimized out>, a=0x7ffc8ecf9890, control=1) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/teddy_avx2.c:688
#6  0x00007fd9ab534ab4 in fdrExec (fdr=<optimized out>, buf=<optimized out>, len=<optimized out>, start=<optimized out>, cb=<optimized out>, scratch=<optimized out>, groups=1) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/fdr/fdr.c:849
#7  0x00007fd9ab5ea0ca in roseBlockFloating (scratch=0x5610ff8e7a00, t=0x7fd9a6351030) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/rose/block.c:259
#8  roseBlockExec (t=<optimized out>, scratch=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/rose/block.c:398
#9  0x00007fd9ab526f98 in rawBlockExec (scratch=0x5610ff8e7a00, rose=0x7fd9a6351030) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/runtime.c:188
#10 hs_scan (db=<optimized out>, data=<optimized out>, length=98, flags=<optimized out>, scratch=0x5610ff8e7a00, onEvent=<optimized out>, userCtx=<optimized out>) at /usr/src/debug/hyperscan-5.4.0-2.1.x86_64/src/runtime.c:419
#11 0x00007fd9abf36d00 in rspamd_re_cache_process_regexp_data (rt=rt@entry=0x5610ff910ab0, re=re@entry=0x5610fe6f33e0, task=task@entry=0x5610ff90faa0, in=in@entry=0x5610ff913020, lens=lens@entry=0x5610ff913040, count=count@entry=1, is_raw=0, 
    processed_hyperscan=0x7ffc8ecf9c28) at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:813
#12 0x00007fd9abf36f0b in rspamd_re_cache_process_headers_list (task=0x5610ff90faa0, rt=0x5610ff910ab0, re=0x5610fe6f33e0, re_class=0x5610feac2790, rh=<optimized out>, is_strong=0, processed_hyperscan=0x7ffc8ecf9c28)
    at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:1092
#13 0x00007fd9abf3857a in rspamd_re_cache_exec_re (is_strong=0, re_class=0x5610feac2790, re=0x5610fe6f33e0, rt=0x5610ff910ab0, task=0x5610ff90faa0) at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:1142
#14 rspamd_re_cache_process (task=0x5610ff90faa0, re=0x5610fe6f33e0, type=<optimized out>, type_data=<optimized out>, datalen=<optimized out>, is_strong=0) at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:1525
#15 0x00007fd9abfdd93b in rspamd_mime_expr_process_regexp (task=0x5610ff90faa0, re=0x7fd9a689c730) at /usr/src/debug/rspamd-3.4/src/libmime/mime_expressions.c:1038
#16 rspamd_mime_expr_process (ud=0x5610ff90faa0, atom=<optimized out>) at /usr/src/debug/rspamd-3.4/src/libmime/mime_expressions.c:1150
#17 0x00007fd9ac0cfd61 in rspamd_ast_process_node.isra.0 (node=<optimized out>, process_data=process_data@entry=0x7ffc8ecf9d10, e=<optimized out>) at /usr/src/debug/rspamd-3.4/src/libutil/expression.c:1357
#18 0x00007fd9abed10aa in rspamd_process_expression_closure (expr=0x5610fe6e4f40, cb=<optimized out>, flags=<optimized out>, runtime_ud=<optimized out>, track=0x0) at /usr/src/debug/rspamd-3.4/src/libutil/expression.c:1483
#19 0x00007fd9ac04770f in process_regexp_item (task=0x5610ff90faa0, symcache_item=0x5610ff911700, user_data=0x7fd9a689c5f0) at /usr/src/debug/rspamd-3.4/src/plugins/regexp.c:552
#20 0x00007fd9abf5975c in rspamd::symcache::normal_item::call (item=0x5610ff911700, task=0x5610ff90faa0, this=0x5610fe88f1d8) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_item.hxx:129
#21 rspamd::symcache::cache_item::call (dyn_item=0x5610ff911700, task=0x5610ff90faa0, this=0x5610fe88f170) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_item.hxx:405
#22 rspamd::symcache::cache_item::call (dyn_item=0x5610ff911700, task=0x5610ff90faa0, this=0x5610fe88f170) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_item.hxx:401
#23 rspamd::symcache::symcache_runtime::process_symbol (this=0x5610ff911320, task=0x5610ff90faa0, cache=..., item=0x5610fe88f170, dyn_item=0x5610ff911700) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_runtime.cxx:493
#24 0x00007fd9ac0b5cda in rspamd::symcache::symcache_runtime::check_item_deps(rspamd_task*, rspamd::symcache::symcache&, rspamd::symcache::cache_item*, rspamd::symcache::cache_dynamic_item*, bool)::{lambda(int, rspamd::symcache::cache_item*, rspamd::symcache::cache_dynamic_item*, auto:1)#1}::operator()<{lambda(int, rspamd::symcache::cache_item*, rspamd::symcache::cache_dynamic_item*, auto:1)#1}>(int, rspamd::symcache::cache_item*, rspamd::symcache::cache_dynamic_item*, {lambda(int, rspamd::symcache::cache_item*, rspamd::symcache::cache_dynamic_item*, auto:1)#1}) const [clone .constprop.0] (__closure=__closure@entry=0x7ffc8ecf9ed0, recursion=recursion@entry=0, item=0x5610fe58a060, rec_functor=..., dyn_item=<optimized out>)
    at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_runtime.cxx:594
#25 0x00007fd9abf5727d in rspamd::symcache::symcache_runtime::check_item_deps (this=this@entry=0x5610ff911320, task=<optimized out>, task@entry=0x5610ff90faa0, cache=..., item=<optimized out>, dyn_item=dyn_item@entry=0x5610ff911458, 
    check_only=<optimized out>, check_only@entry=false) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_runtime.cxx:633
#26 0x00007fd9abf5a153 in rspamd::symcache::symcache_runtime::process_filters (this=0x5610ff911320, task=task@entry=0x5610ff90faa0, cache=..., start_events=<optimized out>) at /usr/include/c++/12/bits/shared_ptr_base.h:1665
#27 0x00007fd9abf5a376 in rspamd::symcache::symcache_runtime::process_symbols (this=<optimized out>, task=task@entry=0x5610ff90faa0, cache=..., stage=<optimized out>) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_runtime.cxx:306
#28 0x00007fd9abf5a3a5 in rspamd_symcache_process_symbols (task=task@entry=0x5610ff90faa0, cache=0x5610fe512140, stage=stage@entry=32) at /usr/src/debug/rspamd-3.4/src/libserver/symcache/symcache_c.cxx:724
#29 0x00007fd9abf5c5e3 in rspamd_task_process (task=0x5610ff90faa0, stages=131071) at /usr/src/debug/rspamd-3.4/src/libserver/task.c:756
#30 0x00007fd9abf5c357 in rspamd_task_process (task=0x5610ff90faa0, stages=stages@entry=131071) at /usr/src/debug/rspamd-3.4/src/libserver/task.c:892
#31 0x00007fd9abf5caf2 in rspamd_task_fin (arg=0x5610ff90faa0) at /usr/src/debug/rspamd-3.4/src/libserver/task.c:160
#32 0x00007fd9abf1d848 in rspamd_session_pending (session=0x5610ff90fdd0) at /usr/src/debug/rspamd-3.4/src/libserver/async_session.c:340
#33 rspamd_session_pending (session=0x5610ff90fdd0) at /usr/src/debug/rspamd-3.4/src/libserver/async_session.c:332
#34 0x00007fd9ac0188eb in lua_redis_push_data (sp_ud=0x5610ff905be0, ctx=0x5610ff905b40, r=<optimized out>) at /usr/src/debug/rspamd-3.4/src/lua/lua_redis.c:436
#35 lua_redis_callback (c=<optimized out>, r=<optimized out>, priv=0x5610ff905be0) at /usr/src/debug/rspamd-3.4/src/lua/lua_redis.c:479
#36 0x00007fd9ac0803eb in __redisRunCallback (reply=<optimized out>, cb=0x7ffc8ecfa130, ac=0x5610ff9039f0) at /usr/src/debug/rspamd-3.4/contrib/hiredis/async.c:269
#37 redisProcessCallbacks (ac=0x5610ff9039f0) at /usr/src/debug/rspamd-3.4/contrib/hiredis/async.c:470
#38 0x00007fd9abb63fde in ev_invoke_pending (loop=0x5610ff8f8e20) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.c:3807
#39 0x00007fd9abb678dc in ev_run (loop=0x5610ff8f8e20, flags=flags@entry=0) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.c:4228
#40 0x00005610fd920400 in ev_loop (flags=0, loop=<optimized out>) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.h:830
#41 start_worker (worker=0x5610ff8f6fb0) at /usr/src/debug/rspamd-3.4/src/worker.c:552
#42 0x00007fd9abf68036 in rspamd_handle_child_fork (listen_sockets=<optimized out>, cf=0x5610fe59af30, rspamd_main=0x5610fe534bc0, wrk=0x5610ff8f6fb0) at /usr/src/debug/rspamd-3.4/src/libserver/worker_util.c:1188
#43 rspamd_fork_worker (rspamd_main=0x5610fe534bc0, cf=0x5610fe59af30, index=<optimized out>, ev_base=<optimized out>, term_handler=<optimized out>, listen_sockets=<optimized out>) at /usr/src/debug/rspamd-3.4/src/libserver/worker_util.c:1305
#44 0x00005610fd91dbff in rspamd_fork_delayed_cb (loop=<optimized out>, w=<optimized out>, revents=<optimized out>) at /usr/src/debug/rspamd-3.4/src/rspamd.c:373
#45 0x00007fd9abb63fde in ev_invoke_pending (loop=0x7fd9abb6d060 <default_loop_struct>) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.c:3807
#46 0x00007fd9abb678dc in ev_run (loop=loop@entry=0x7fd9abb6d060 <default_loop_struct>, flags=flags@entry=0) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.c:4228
#47 0x00005610fd91175b in ev_loop (flags=0, loop=0x7fd9abb6d060 <default_loop_struct>) at /usr/src/debug/rspamd-3.4/contrib/libev/ev.h:830
#48 main (argc=<optimized out>, argv=<optimized out>, env=<optimized out>) at /usr/src/debug/rspamd-3.4/src/rspamd.c:1634

the variable you wanted to see:

(gdb) f 13                                                                                                                                                                                                                                                     
#13 0x00007fd9abf3857a in rspamd_re_cache_exec_re (is_strong=0, re_class=0x5610feac2790, re=0x5610fe6f33e0, rt=0x5610ff910ab0, task=0x5610ff90faa0) at /usr/src/debug/rspamd-3.4/src/libserver/re_cache.c:1142                                                 
1142                            ret = rspamd_re_cache_process_headers_list (task, rt, re,    
(gdb) p *((rspamd::util::hs_shared_database *)re_class->hs_db)
$1 = {db = 0x7fd9a6351000, 
  maybe_map = {<std::_Optional_base<rspamd::util::raii_mmaped_file, false, false>> = {<std::_Optional_base_impl<rspamd::util::raii_mmaped_file, std::_Optional_base<rspamd::util::raii_mmaped_file, false, false> >> = {<No data fields>}, 
      _M_payload = {<std::_Optional_payload<rspamd::util::raii_mmaped_file, true, false, false>> = {<std::_Optional_payload_base<rspamd::util::raii_mmaped_file>> = {_M_payload = {_M_empty = {<No data fields>}, _M_value = {file = {
                  _vptr.raii_file = 0x7fd9ac1f9300 <vtable for rspamd::util::raii_file+16>, fd = 70, temp = false, fname = {_M_dataplus = {<std::allocator<char>> = {<std::__new_allocator<char>> = {<No data fields>}, <No data fields>}, 
                      _M_p = 0x5610ff8e77a0 "/var/lib/rspamd/328e83a85169c6d9720c3e18a2e1930740da6f35fc3db6f0f3ba12855b37fe8e.hs.unser"}, _M_string_length = 89, {_M_local_buf = "Z", '\000' <repeats 14 times>, _M_allocated_capacity = 90}}, st = {
                    st_dev = 65027, st_ino = 7602017, st_nlink = 1, st_mode = 33188, st_uid = 435, st_gid = 434, __pad0 = 0, st_rdev = 0, st_size = 198648, st_blksize = 4096, st_blocks = 392, st_atim = {tv_sec = 1667407317, tv_nsec = 39546442}, 
                    st_mtim = {tv_sec = 1667407317, tv_nsec = 39546442}, st_ctim = {tv_sec = 1667407317, tv_nsec = 39546442}, __glibc_reserved = {0, 0, 0}}}, map = 0x7fd9a6351000, map_size = 198648}}, 
            _M_engaged = true}, <No data fields>}, <No data fields>}}, <std::_Enable_copy_move<false, false, true, true, std::optional<rspamd::util::raii_mmaped_file> >> = {<No data fields>}, <No data fields>}}

disassemble off the crash area

(gdb) disassemble $pc-64,$pc+64
Dump of assembler code from 0x7fd9ab635e7f to 0x7fd9ab635eff:
   0x00007fd9ab635e7f <fdr_exec_fat_teddy_msks3+2687>:  add    %cl,-0x77(%rax)
   0x00007fd9ab635e82 <fdr_exec_fat_teddy_msks3+2690>:  rorl   $0xe1,-0x3f(%rax)
   0x00007fd9ab635e86 <fdr_exec_fat_teddy_msks3+2694>:  add    $0x646ff9c5,%eax
   0x00007fd9ab635e8b <fdr_exec_fat_teddy_msks3+2699>:  adc    %edx,(%rax)
   0x00007fd9ab635e8d <fdr_exec_fat_teddy_msks3+2701>:  cmp    $0x8,%rax
   0x00007fd9ab635e91 <fdr_exec_fat_teddy_msks3+2705>:  ja     0x7fd9ab636b8c <fdr_exec_fat_teddy_msks3+6028>
   0x00007fd9ab635e97 <fdr_exec_fat_teddy_msks3+2711>:  lea    0x175136(%rip),%rcx        # 0x7fd9ab7aafd4
   0x00007fd9ab635e9e <fdr_exec_fat_teddy_msks3+2718>:  movslq (%rcx,%rax,4),%rdx
   0x00007fd9ab635ea2 <fdr_exec_fat_teddy_msks3+2722>:  add    %rcx,%rdx
   0x00007fd9ab635ea5 <fdr_exec_fat_teddy_msks3+2725>:  jmp    *%rdx
   0x00007fd9ab635ea7 <fdr_exec_fat_teddy_msks3+2727>:  nopw   0x0(%rax,%rax,1)
   0x00007fd9ab635eb0 <fdr_exec_fat_teddy_msks3+2736>:  movabs $0xf0f0f0f0f0f0f0f,%rax
   0x00007fd9ab635eba <fdr_exec_fat_teddy_msks3+2746>:  vbroadcasti128 (%r15),%ymm1
=> 0x00007fd9ab635ebf <fdr_exec_fat_teddy_msks3+2751>:  vmovdqa 0x40(%r9),%ymm7
   0x00007fd9ab635ec5 <fdr_exec_fat_teddy_msks3+2757>:  vmovq  %rax,%xmm5
   0x00007fd9ab635eca <fdr_exec_fat_teddy_msks3+2762>:  vpbroadcastq %xmm5,%ymm0
   0x00007fd9ab635ecf <fdr_exec_fat_teddy_msks3+2767>:  vpsrlq $0x4,%ymm1,%ymm2
   0x00007fd9ab635ed4 <fdr_exec_fat_teddy_msks3+2772>:  vmovdqa 0x60(%r9),%ymm5
   0x00007fd9ab635eda <fdr_exec_fat_teddy_msks3+2778>:  vpand  %ymm0,%ymm1,%ymm1
   0x00007fd9ab635ede <fdr_exec_fat_teddy_msks3+2782>:  vpand  %ymm0,%ymm2,%ymm2
   0x00007fd9ab635ee2 <fdr_exec_fat_teddy_msks3+2786>:  vmovdqa 0x80(%r9),%ymm0
   0x00007fd9ab635eeb <fdr_exec_fat_teddy_msks3+2795>:  vpshufb %ymm1,%ymm7,%ymm4
   0x00007fd9ab635ef0 <fdr_exec_fat_teddy_msks3+2800>:  vpshufb %ymm2,%ymm5,%ymm5
   0x00007fd9ab635ef5 <fdr_exec_fat_teddy_msks3+2805>:  vmovdqa 0xa0(%r9),%ymm7
   0x00007fd9ab635efe <fdr_exec_fat_teddy_msks3+2814>:  vpshufb %ymm1,%ymm0,%ymm0

@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

Well, I see that the database itself is aligned on 4096 bytes boundary (as it should be apparently). What is the problematic address by the way?

@darix
Copy link
Author

darix commented Nov 8, 2022

do you have any recommendations how to test that without dropping rspamd 3.4 into my production setup? then i could do some bisecting.

@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

Could you show what's in the registers (in %r9 in particular). info registers

@darix
Copy link
Author

darix commented Nov 8, 2022

(gdb) info registers 
rax            0xf0f0f0f0f0f0f0f   1085102592571150095
rbx            0x5610ff918310      94631007191824
rcx            0x5610ff918310      94631007191824
rdx            0x0                 0
rsi            0x62                98
rdi            0x7ffc8ecf9890      140722704455824
rbp            0x7ffc8ecf9880      0x7ffc8ecf9880
rsp            0x7ffc8ecf96c0      0x7ffc8ecf96c0
r8             0x5610ff918372      94631007191922
r9             0x7fd9a63551b0      140572773142960
r10            0x7fd9ab7a48a0      140572861548704
r11            0x1                 1
r12            0x5610ff918320      94631007191840
r13            0x17                23
r14            0x5610ff918372      94631007191922
r15            0x5610ff918310      94631007191824
rip            0x7fd9ab635ebf      0x7fd9ab635ebf <fdr_exec_fat_teddy_msks3+2751>
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

And what version of hyperscan is used?

@darix
Copy link
Author

darix commented Nov 8, 2022

hyperscan-5.4.0

@vstakhov
Copy link
Member

vstakhov commented Nov 8, 2022

I have created an issue in the Hyperscan repo, as it does not look like an Rspamd issue. But we will see...

@darix
Copy link
Author

darix commented Nov 8, 2022

so the mmap part is new in 3.4? I downgraded my production setup to 3.3 again.

@darix
Copy link
Author

darix commented Nov 12, 2022

Building a package with 068714f to test run this in production.

@darix
Copy link
Author

darix commented Nov 12, 2022

still seeing crashes with that patch. downgrading to 3.3 again.

@vstakhov
Copy link
Member

You need to remove the existing *.unser files from /var/lib/rspamd before checking.

@thesamesam
Copy link

cc @arkamar

@darix
Copy link
Author

darix commented Nov 13, 2022

You need to remove the existing *.unser files from /var/lib/rspamd before checking.

Which begs the question how should distro packages handle this? can we detect bad *.unser files and delete them in a post install scriplet? should we have a warning for users? or can rspamd detect bad files and discard them during load?

@vstakhov
Copy link
Member

can we detect bad *.unser files and delete them in a post install scriplet

It is safe just to remove all *.unser files on upgrade.

In theory, I can also add some suffix to unser files to distinguish them from the valid ones allowing the existing leftover cleanup logic to deal with the rest. On the other hand, I see no clear benefits from that approach aside that you don't need to add anything to the post-install scriptlet. Since this issue is presumably avx2 specific, I'm really not sure if any additional steps are required.

@arkamar
Copy link
Contributor

arkamar commented Nov 14, 2022

can we detect bad *.unser files and delete them in a post install scriplet

It is safe just to remove all *.unser files on upgrade.

I think this is not possible, because *.unser files were introduced in version 3.4 and they are created when rspamd-3.4 is first time started.

@vstakhov
Copy link
Member

I'm not with you here, could you please elaborate more? Assuming that we have 3.4-1 where there is an issue with the alignment, and 3.4-2 where there is no issue with the alignment, why cannot we also include cleanup of *.unser in the post-install for 3.4-2?

@arkamar
Copy link
Contributor

arkamar commented Nov 14, 2022

I was writing about situation when users are upgrading from older version, like 3.3.

@vstakhov
Copy link
Member

In this case, this post-install will be no-op and everything will work fine.

@arkamar
Copy link
Contributor

arkamar commented Nov 14, 2022

Sorry, I don't get it. Those are steps which downstream users follow right now:

  1. user has 3.3 installed, no .unser files
  2. upgrades to 3.4, noop because there are no .unser files
  3. starts 3.4 first time, .unser files are created -> segfaults.

Is there a way to upgrade from 3.3 to 3.4 without segfaults?

@darix
Copy link
Author

darix commented Nov 14, 2022

find /var/lib/rspamd -type f -name '*.unser' -delete

that is a pretty safe method

@darix
Copy link
Author

darix commented Nov 14, 2022

Is there a way to upgrade from 3.3 to 3.4 without segfaults?

Upgrade to 3.4 with the patch above applied?

@arkamar
Copy link
Contributor

arkamar commented Nov 14, 2022

Is there a way to upgrade from 3.3 to 3.4 without segfaults?

Upgrade to 3.4 with the patch above applied?

Ah, I see, I understand what 3.4-1 and 3.4-2 mean now. I was confused by

still seeing crashes with that patch. downgrading to 3.3 again.

Thanks, I will try it.

arkamar added a commit to arkamar/gentoo that referenced this issue Nov 14, 2022
This revision applies patch taken from upstream [1] which fixes
page-alignment issue of .unser files causing segfaults. The issue
affects only those who already started rspamd-3.4. All .unser files will
be automatically removed in postinstall phase for those who are updating
from 3.4 to 3.4-r1.

[1] rspamd/rspamd#4329

Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
@darix
Copy link
Author

darix commented Nov 14, 2022

when are those unser files created? are they written during shutdown or load? for rpm packages and i guess it is similar for other package managers. we just call the restart function of the init system.

# restart the service
%postun
%service_del_postun %{name}.service

so the obvious choice would be:

# restart the service
%postun
find /var/lib/rspamd/ -type f -name '*.unser' -delete
%service_del_postun %{name}.service

but if they get written again in the stop code path, then the find wouldnt solve the problem. we could work around it with expanding the macro manually to something like

%postin
if systemctl is-active %{name}.service
  # there will be some extra noise here as we have to do some extra check which are normally hidden in the macro
  systemctl stop %{name}.service
  find /var/lib/rspamd/ -name '*.unser' -delete
  systemctl start %{name}.service
else
  find /var/lib/rspamd/ -name '*.unser' -delete
fi

@vstakhov could you comment if the simple code block above will work or if we will need the 2nd longer code block?

@vstakhov
Copy link
Member

unser files are normally created on start and it is safe to remove those files when Rspamd is still running.

@darix
Copy link
Author

darix commented Nov 14, 2022

cool. thank you for the confirmation.

gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Nov 15, 2022
This revision applies patch taken from upstream [1] which fixes
page-alignment issue of .unser files causing segfaults. The issue
affects only those who already started rspamd-3.4. All .unser files will
be automatically removed in postinstall phase for those who are updating
from 3.4 to 3.4-r1.

[1] rspamd/rspamd#4329

Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
Closes: #28263
Signed-off-by: Sam James <sam@gentoo.org>
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this issue Nov 16, 2022
https://build.opensuse.org/request/show/1036202
by user darix + dimstar_suse
- Move cleanup code to %pre because otherwise it doesnt trigger
  early enough

- Upgrade to 3.4 again
  - Fix metadata_exporter with many recipients by @yo000 in #4294
  - [Fix] Fix favicon.ico Content-Type header by @moisseev in #4302
  - [Minor] Fix copy-paste error by @moisseev in #4305
  - Add basic auth to metadata_exporter http pusher by @yo000 in
    #4300
  - [Enhancement] Add composite rule against AFF involving
    freemailers by @twesterhever in #4304
  - Penalize bounce spam by @frederikbosch in #4308
- Added 068714f9f5a96fbd94560211cec75775ee023d02.patch:
  Official patch for the unaligned memory issue described in
  rspamd/rspamd#4329
- Add cleanup code to the %postun scriptlet to remove bad files,
  created by earlier/unpatched 3.4 versions, during
ajayramaswamy added a commit to ajayramaswamy/rspamd-rpm that referenced this issue Nov 17, 2022
LorbusChris pushed a commit to LorbusChris/rspamd-rpm that referenced this issue Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants