Segmentation fault on concurrent shared_dict set+delete calls #679

@eugen-ukraine

Steps to reproduce:

nginx.conf

load_module modules/ngx_http_js_module.so;

worker_processes 8;

events {
}

http {
    js_path "/etc/nginx/njs/";
    js_import test.js;
    js_shared_dict_zone zone=test:10M type=number;
    
    server {
        listen 80;
        
        location /set {
            js_content test.dict_set;
        }
        
        location /delete {
            js_content test.dict_delete;
        }
    }
}

test.js

function dict_set(r) {
    ngx.shared.test.set('key', 0);
    r.return(200, "set\n");
}

function dict_delete(r) {
    ngx.shared.test.delete('key');
    r.return(200, "delete\n");
}

export default {dict_set, dict_delete};

I ran four /set request loops and four /delete request loops in parallel. It took a few minutes to catch a segfault:

while true; do curl http://127.0.0.1/set; done
while true; do curl http://127.0.0.1/delete; done
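
For convenience, the eight loops can be started from a single script. This is only a sketch under assumptions: `BASE_URL`, `request_loop`, and `start_loops` are illustrative names that do not appear in the report, and the optional `MAX` argument exists only so the loops can be bounded for a dry run.

```shell
#!/bin/sh
# Sketch of a load generator for the reproduction above.
# BASE_URL, request_loop, and start_loops are illustrative names (assumptions).
BASE_URL="${BASE_URL:-http://127.0.0.1}"

# request_loop PATH [MAX]: request $BASE_URL/PATH repeatedly.
# MAX=0 (the default) loops forever, like `while true` in the report.
request_loop() {
    path=$1
    max=${2:-0}
    n=0
    while [ "$max" -eq 0 ] || [ "$n" -lt "$max" ]; do
        curl -s -o /dev/null "$BASE_URL/$path"
        n=$((n + 1))
    done
}

# start_loops COUNT [MAX]: start COUNT background loops per endpoint,
# then wait for all of them to finish.
start_loops() {
    count=$1
    max=${2:-0}
    i=0
    while [ "$i" -lt "$count" ]; do
        request_loop set "$max" &
        request_loop delete "$max" &
        i=$((i + 1))
    done
    wait
}
```

Calling `start_loops 4` should approximate the setup described here: four /set loops and four /delete loops running until interrupted.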

Stack trace of segfaulted worker process:

Core was generated by `nginx:'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ngx_rbtree_delete (tree=0x7fc2a7e48000, node=node@entry=0x7fc2a7e4a000) at src/core/ngx_rbtree.c:195
195	src/core/ngx_rbtree.c: No such file or directory.
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-236.el8.7.x86_64 libblkid-2.32.1-43.el8.x86_64 libcap-2.48-5.el8.x86_64 libgcc-8.5.0-21.el8.x86_64 libgcrypt-1.8.5-7.el8.x86_64 libgpg-error-1.31-1.el8.x86_64 libmount-2.32.1-43.el8.x86_64 libselinux-2.9-8.el8.x86_64 libuuid-2.32.1-43.el8.x86_64 libxcrypt-4.1.1-6.el8.x86_64 libxml2-2.9.7-16.el8.x86_64 libxslt-1.1.32-6.el8.x86_64 openssl-libs-1.1.1k-9.el8.x86_64 pcre2-10.32-3.el8.x86_64 sssd-client-2.9.2-1.el8.x86_64 systemd-libs-239-78.el8.x86_64 xz-libs-5.2.4-4.el8.x86_64 zlib-1.2.11-25.el8.x86_64
(gdb) bt
#0  ngx_rbtree_delete (tree=0x7fc2a7e48000, node=node@entry=0x7fc2a7e4a000) at src/core/ngx_rbtree.c:195
#1  0x00007fc2a9d1d634 in ngx_js_dict_delete (vm=vm@entry=0x5615c8768460, dict=0x5615c877b5c8, key=key@entry=0x7ffce4f216c0, retval=retval@entry=0x0) at njs-0.8.1/nginx/ngx_js_shared_dict.c:1256
#2  0x00007fc2a9d1e3d7 in njs_js_ext_shared_dict_delete (vm=0x5615c8768460, args=<optimized out>, nargs=2, unused=<optimized out>, retval=0x5615c87fb628) at njs-0.8.1/nginx/ngx_js_shared_dict.c:527
#3  0x00007fc2a9d6f054 in njs_function_native_call (retval=<optimized out>, vm=0x5615c8768460) at src/njs_function.c:684
#4  njs_function_frame_invoke (vm=vm@entry=0x5615c8768460, retval=<optimized out>) at src/njs_function.c:684
#5  0x00007fc2a9d30ddd in njs_vmcode_interpreter (vm=vm@entry=0x5615c8768460, pc=0x5615c87aa030 "\r\002", rval=rval@entry=0x7ffce4f21a30, promise_cap=promise_cap@entry=0x0, async_ctx=async_ctx@entry=0x0) at src/njs_vmcode.c:1451
#6  0x00007fc2a9d6ef5b in njs_function_lambda_call (vm=0x5615c8768460, vm@entry=0x1, retval=0x7ffce4f21a30, promise_cap=promise_cap@entry=0x0) at src/njs_function.c:611
#7  0x00007fc2a9d6f085 in njs_function_frame_invoke (retval=<optimized out>, vm=0x1) at src/njs_function.c:687
#8  njs_function_frame_invoke (vm=vm@entry=0x1, retval=retval@entry=0x7ffce4f21a30) at src/njs_function.c:671
#9  0x00007fc2a9d2e0d0 in njs_vm_invoke (vm=vm@entry=0x1, function=<optimized out>, args=args@entry=0x5615c87f4408, nargs=nargs@entry=1, retval=retval@entry=0x7ffce4f21a30) at src/njs_vm.c:622
#10 0x00007fc2a9d15d95 in ngx_js_invoke (vm=0x1, fname=0x7ffce4f21a30, log=0x5615c87f3430, args=args@entry=0x5615c87f4408, nargs=nargs@entry=1, retval=retval@entry=0x7ffce4f21a30) at njs-0.8.1/nginx/ngx_js.c:242
#11 0x00007fc2a9d15e50 in ngx_js_call (vm=<optimized out>, fname=<optimized out>, log=<optimized out>, args=args@entry=0x5615c87f4408, nargs=nargs@entry=1) at njs-0.8.1/nginx/ngx_js.c:219
#12 0x00007fc2a9d13663 in ngx_http_js_content_event_handler (r=0x5615c87f3630) at njs-0.8.1/nginx/ngx_http_js_module.c:964
#13 ngx_http_js_content_event_handler (r=0x5615c87f3630) at njs-0.8.1/nginx/ngx_http_js_module.c:933
#14 0x00005615c825f33a in ngx_http_read_client_request_body (post_handler=0x7fc2a9d13600 <ngx_http_js_content_event_handler>, r=0x5615c87f3630) at src/http/ngx_http_request_body.c:84
#15 ngx_http_read_client_request_body (r=0x5615c87f3630, post_handler=post_handler@entry=0x7fc2a9d13600 <ngx_http_js_content_event_handler>) at src/http/ngx_http_request_body.c:32
#16 0x00007fc2a9d10c04 in ngx_http_js_content_handler (r=<optimized out>) at njs-0.8.1/nginx/ngx_http_js_module.c:921
#17 0x00005615c8251b46 in ngx_http_core_content_phase (r=0x5615c87f3630, ph=<optimized out>) at src/http/ngx_http_core_module.c:1261
#18 0x00005615c824c3cd in ngx_http_core_run_phases (r=r@entry=0x5615c87f3630) at src/http/ngx_http_core_module.c:875
#19 0x00005615c824c4a9 in ngx_http_handler (r=r@entry=0x5615c87f3630) at src/http/ngx_http_core_module.c:858
#20 0x00005615c8257512 in ngx_http_process_request (r=r@entry=0x5615c87f3630) at src/http/ngx_http_request.c:2094
#21 0x00005615c8257a9f in ngx_http_process_request_headers (rev=rev@entry=0x5615c87d9340) at src/http/ngx_http_request.c:1496
#22 0x00005615c8257e74 in ngx_http_process_request_line (rev=0x5615c87d9340) at src/http/ngx_http_request.c:1163
#23 0x00005615c823cc7e in ngx_epoll_process_events (cycle=<optimized out>, timer=<optimized out>, flags=1) at src/event/modules/ngx_epoll_module.c:901
#24 0x00005615c8233028 in ngx_process_events_and_timers (cycle=cycle@entry=0x5615c8763ce0) at src/event/ngx_event.c:248
#25 0x00005615c823ad19 in ngx_worker_process_cycle (cycle=cycle@entry=0x5615c8763ce0, data=data@entry=0x4) at src/os/unix/ngx_process_cycle.c:721
#26 0x00005615c82395ff in ngx_spawn_process (cycle=cycle@entry=0x5615c8763ce0, proc=proc@entry=0x5615c823ac90 <ngx_worker_process_cycle>, data=data@entry=0x4, name=name@entry=0x5615c82e6c67 "worker process", respawn=respawn@entry=-3)
    at src/os/unix/ngx_process.c:199
#27 0x00005615c823b124 in ngx_start_worker_processes (cycle=cycle@entry=0x5615c8763ce0, n=8, type=type@entry=-3) at src/os/unix/ngx_process_cycle.c:344
#28 0x00005615c823b987 in ngx_master_process_cycle (cycle=0x5615c8763ce0) at src/os/unix/ngx_process_cycle.c:130
#29 0x00005615c82116f8 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:383

For this particular core dump, nginx logged only "worker process 655135 exited on signal 11 (core dumped)" to error.log. Sometimes errors like these appear in the log before the crash:

2023/10/19 15:35:19 [alert] 655002#655002: ngx_slab_free(): pointer to wrong chunk in js shared zone "test"
2023/10/19 15:36:10 [alert] 655009#655009: worker process 655016 exited on signal 11 (core dumped)

2023/10/19 15:46:36 [alert] 655074#655074: ngx_slab_free(): page is already free in js shared zone "test"
2023/10/19 15:47:47 [alert] 655073#655073: ngx_slab_free(): chunk is already free in js shared zone "test"
2023/10/19 15:48:12 [alert] 655069#655069: worker process 655072 exited on signal 11 (core dumped)
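
To pull only these messages out of a busy log, a small filter along the following lines may help. It is a sketch: the function name `slab_alerts` is hypothetical, and the log path is whatever your `error_log` directive points to.

```shell
#!/bin/sh
# slab_alerts LOGFILE: print ngx_slab_free() alerts and worker crashes
# (signal 11) from an nginx error log. The function name is illustrative.
slab_alerts() {
    grep -E 'ngx_slab_free\(\)|exited on signal 11' "$1"
}
```

For example, `slab_alerts /var/log/nginx/error.log`, assuming that is where the error log lives on your system.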

OS: CentOS Stream 8
nginx and njs: latest from the nginx-stable repo for el8:

nginx.x86_64                          1:1.24.0-1.el8.ngx                   @nginx-stable   
nginx-module-njs.x86_64               1:1.24.0+0.8.1-1.el8.ngx             @nginx-stable   
