ngx.exit with HTTP_NOT_FOUND causes worker process to exit #110

Closed
shaneeb opened this issue May 6, 2012 · 10 comments

@shaneeb

shaneeb commented May 6, 2012

Hi,
I am trying to write a dynamic request router using redis and the cosocket API. I read the current request headers in access_by_lua, get the data from redis, and set the 'target' variable, which is then used by proxy_pass to route the request. If I don't find the route for a particular request, I want to return the HTTP_NOT_FOUND error. The problem is that if I call ngx.exit(ngx.HTTP_NOT_FOUND), the correct response is returned but I see a "worker process XXXXX exited with signal 11" in the logs. Is this expected?

When I load test the dynamic router using ApacheBench with an incorrect request (404 error expected), "ab" closes with a 'peer connection reset' error. How do I solve this? The router is supposed to scale to huge traffic, so I can't afford the worker process restarting with every incorrect request.
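
A minimal sketch for counting worker crashes in the error log after such a test run. The helper name `count_worker_crashes`, the log path, and the URL in the usage comments are assumptions for illustration, not part of the original setup. The pattern accepts both the "exited with signal 11" wording quoted above and the "exited on signal 11" wording that nginx normally logs.

```shell
#!/bin/sh
# Count the lines in an nginx error log that report a worker process
# terminated by signal 11 (SIGSEGV). Hypothetical helper; $1 is the log file.
count_worker_crashes() {
    awk '/worker process [0-9]+ exited (on|with) signal 11/ { n++ }
         END { print n + 0 }' "$1"
}

# Example usage (URL and log path are assumptions):
#   ab -n 100 -c 10 http://127.0.0.1:8080/no-such-route
#   count_worker_crashes logs/error.log
```

A non-zero count after a run of requests that should only produce 404 responses suggests the workers are being respawned by the master process.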

Thanks for your help!

@agentzh
Member

agentzh commented May 8, 2012

Thank you for the report! But which version of ngx_lua and nginx are you using? which operating system are you in? Can you provide a minimized but complete nginx.conf snippet that can reproduce this problem? Thank you!

@agentzh
Member

agentzh commented May 8, 2012

Also, it'll be very very helpful if you can provide the gdb backtrace for the crash in your nginx worker process :) Thanks!

@agentzh
Member

agentzh commented May 10, 2012

Any updates on this issue? I tried several examples on my side and could not reproduce the crash :(

agentzh added a commit to openresty/lua-resty-redis that referenced this issue May 10, 2012
@shaneeb
Author

shaneeb commented May 10, 2012

I apologize for replying late. I am using the OpenResty bundle version 1.0.11.28 (which has ngx_lua version 0.5.0rc21) on Ubuntu 12.04 and I have something like this in my nginx configuration:

http { 
    error_page    400    /errors/400.html;
    error_page    404    /errors/404.html;
    server {
        listen 8080;

        location /errors/ {
            root static;
        }

        location / {
            set $target '';

            access_by_lua '
                -- Extract data based on which routing is done

                -- Create connection using lua-resty-redis library
                local ok, err = red:connect("127.0.0.1", 6379)
                if not ok then
                    ngx.log(ngx.ERR, "failed to connect to router: ", err)
                    ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
                end

                -- Send requests to get data from redis

                -- Put connection into pool before processing the response
                local ok, err2 = red:set_keepalive(0, 100)
                if not ok then
                    ngx.log(ngx.ERR, "failed to set keepalive: ", err2)
                    return
                end

                if not result then
                    ngx.log(ngx.ERR, "failed to get data from router: ", err)
                    ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
                elseif result == ngx.null then
                    -- The following line crashes the worker but AFTER sending the response
                    ngx.exit(ngx.HTTP_NOT_FOUND)
                else
                    ngx.var.target = result
                end
            ';

            # Some proxy settings
            proxy_pass http://$target;
        }
    }
}

This is of course not the complete code, as is obvious, but I have tried explaining the missing portions with comments. If you need more details I would be happy to provide them. If you want me to experiment with the code, please let me know.

I can provide a gdb backtrace but I would need a little help from you :) Please give me some pointers on how to do it. I am familiar with debugging with gdb, but some instructions for openresty and ngx_lua would give me a head start.

Thanks for looking into this issue!

@agentzh
Member

agentzh commented May 11, 2012

On Fri, May 11, 2012 at 2:38 AM, shaneeb wrote:

> I apologize for replying late.

That's OK :)

> I am using the OpenResty bundle version 1.0.11.28 (which has ngx_lua version 0.5.0rc21) on Ubuntu 12.04 and I have something like this in my nginx configuration:

There have been some bug fixes since ngx_lua 0.5.0rc21. Could you
please try out the prerelease of ngx_openresty 1.0.15.3?

http://agentzh.org/misc/nginx/ngx_openresty-1.0.15.3.tar.gz

> I can provide a gdb backtrace but I would need a little help from you :) Please give me some pointers as to how to do it. I am familiar with debugging with gdb but some instructions for openresty and ngx_lua would give me a headstart.

Well, it's very simple. Here are the steps:

  1. configure only one single worker process in your nginx.conf:

    worker_processes 1;
    master_process on;
    daemon on;

  2. Start your nginx as before and then grab the pid of the worker
    process, for example:

    $ ps aux|grep nginx|grep worker|grep -v grep

  3. Attach gdb to the nginx worker process:

    sudo gdb -p <pid>

    where <pid> should be substituted by the nginx worker
    process's PID found in step 2.

  4. Then on the gdb prompt, type the "c" command:

    (gdb) c

    then you'll see the output

    Continuing.

  5. In another terminal, just request the web interface with curl or ab
    as usual, until the crash happens. Then return to the terminal running
    gdb.

  6. On the gdb prompt, type the "bt" command to get the complete backtrace.

    (gdb) bt
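
The steps above can be sketched as a small shell script. This is only a sketch under assumptions: the function names are hypothetical, it assumes the single worker process configured in step 1 and the usual `ps aux` column layout (command starting in column 11), and it uses gdb's `-batch` and `-ex` options to run the "c" and "bt" commands non-interactively instead of typing them at the prompt.

```shell
#!/bin/sh
# Read `ps aux` output on stdin and print the PID (column 2) of every
# nginx worker process. Matching on the command columns avoids picking
# up the grep/awk processes themselves. Hypothetical helper name.
nginx_worker_pids() {
    awk '$11 == "nginx:" && $12 == "worker" { print $2 }'
}

# Attach gdb to the first worker found, continue until the process stops
# (for example on SIGSEGV), then print the backtrace and exit.
# Hypothetical helper name.
backtrace_nginx_worker() {
    pid=$(ps aux | nginx_worker_pids | head -n 1)
    if [ -z "$pid" ]; then
        echo "no nginx worker process found" >&2
        return 1
    fi
    sudo gdb -p "$pid" -batch -ex 'continue' -ex 'bt'
}
```

Run `backtrace_nginx_worker` in one terminal, then send the failing requests with curl or ab from another; the backtrace should be printed once the worker receives the signal.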

Thank you for your cooperation!

Best regards,
-agentzh

@shaneeb
Author

shaneeb commented May 11, 2012

Ok so I experimented a bit and here are my findings:

If you read the nginx configuration I posted above you will notice I have 'error_page' set for 400 and 404 status codes for which I return static html files. The problem is related to this somehow. If I remove these error_page clauses so that I get the default ngx_openresty error pages, the worker process does not crash. Similarly, if I define these clauses for status codes other than 404, say, for 500 and return ngx.HTTP_INTERNAL_SERVER_ERROR from ngx.exit, the worker process crashes in the exact same way. Hence, the issue is most probably associated with these error_page clauses.

I backtraced with gdb using both OpenResty versions 1.0.11.28 and 1.0.15.3 and I get the same trace:

Program received signal SIGSEGV, Segmentation fault.
ngx_http_lua_socket_finalize (r=0x196e960, u=0x408f52d0) at ../ngx_lua-0.5.0rc26/src/ngx_http_lua_socket.c:2049
2049            *ll = ctx->free_recv_bufs;
(gdb) bt
#0  ngx_http_lua_socket_finalize (r=0x196e960, u=0x408f52d0) at ../ngx_lua-0.5.0rc26/src/ngx_http_lua_socket.c:2049
#1  0x000000000048d53a in ngx_http_lua_socket_cleanup (data=<optimized out>) at ../ngx_lua-0.5.0rc26/src/ngx_http_lua_socket.c:2023
#2  0x000000000043cd78 in ngx_http_free_request (r=0x196e960, rc=0) at src/http/ngx_http_request.c:2979
#3  0x000000000043e6b1 in ngx_http_set_keepalive (r=0x196e960) at src/http/ngx_http_request.c:2480
#4  ngx_http_finalize_connection (r=0x196e960) at src/http/ngx_http_request.c:2170
#5  0x000000000043f0d5 in ngx_http_finalize_request (r=0x196e960, rc=-4) at src/http/ngx_http_request.c:1901
#6  0x000000000043f24a in ngx_http_finalize_request (r=0x196e960, rc=404) at src/http/ngx_http_request.c:1961
#7  0x000000000043a27a in ngx_http_core_access_phase (r=<optimized out>, ph=0x1972ed0) at src/http/ngx_http_core_module.c:1139
#8  0x0000000000436763 in ngx_http_core_run_phases (r=0x196e960) at src/http/ngx_http_core_module.c:872
#9  0x000000000043ef64 in ngx_http_run_posted_requests (c=0x7fbd38552190) at src/http/ngx_http_request.c:1859
#10 0x000000000048db9c in ngx_http_lua_socket_tcp_handler (ev=<optimized out>) at ../ngx_lua-0.5.0rc26/src/ngx_http_lua_socket.c:1727
#11 0x00000000004304cd in ngx_epoll_process_events (cycle=<optimized out>, timer=<optimized out>, flags=<optimized out>)
    at src/event/modules/ngx_epoll_module.c:679
#12 0x0000000000429269 in ngx_process_events_and_timers (cycle=0x1955550) at src/event/ngx_event.c:246
#13 0x000000000042efb7 in ngx_worker_process_cycle (cycle=0x1955550, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:802
#14 0x000000000042d862 in ngx_spawn_process (cycle=0x1955550, proc=0x42eee6 <ngx_worker_process_cycle>, data=0x0, name=0x4ac398 "worker process", respawn=-3)
    at src/os/unix/ngx_process.c:197
#15 0x000000000042e666 in ngx_start_worker_processes (cycle=0x1955550, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:361
#16 0x000000000042f670 in ngx_master_process_cycle (cycle=0x1955550) at src/os/unix/ngx_process_cycle.c:137
#17 0x0000000000415204 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:412

Hope this helps.

@agentzh
Member

agentzh commented May 12, 2012

On Sat, May 12, 2012 at 12:56 AM, shaneeb wrote:

> Ok so I experimented a bit and here are my findings:
>
> If you read the nginx configuration I posted above you will notice I have 'error_page' set for 400 and 404 status codes for which I return static html files. The problem is related to this somehow. If I remove these error_page clauses so that I get the default ngx_openresty error pages, the worker process does not crash. Similarly, if I define these clauses for status codes other than 404, say, for 500 and return ngx.HTTP_INTERNAL_SERVER_ERROR from ngx.exit, the worker process crashes in the exact same way. Hence, the issue is most probably associated with these error_page clauses.

Thank you very much for this hint! It helped me reproduce this crash
on my side (with a very similar backtrace). I'll investigate this
immediately ;)

Thanks!
-agentzh

agentzh added a commit that referenced this issue May 12, 2012
…ocket cleanup handler due to the lack of check of the ctx pointer. thanks shaneeb for reporting this in github issue #110.
@agentzh
Member

agentzh commented May 12, 2012

I've just committed a patch to ngx_lua git master for this bug:

https://github.com/chaoslawful/lua-nginx-module/commit/d00976d

Could you please try it out on your side?

Thanks!
-agentzh

agentzh added a commit to openresty/lua-resty-redis that referenced this issue May 12, 2012
@agentzh
Member

agentzh commented May 12, 2012

Please try out ngx_openresty 1.0.15.3rc2, which includes the latest ngx_lua containing the fix:

http://agentzh.org/misc/nginx/ngx_openresty-1.0.15.3rc2.tar.gz

Thanks!
-agentzh

@shaneeb
Author

shaneeb commented May 12, 2012

Works like a charm! Thanks a lot for fixing this in such a short time. You rock :)

@shaneeb shaneeb closed this as completed May 12, 2012
bakins pushed a commit to bakins/lua-nginx-module that referenced this issue Jun 30, 2012
bakins pushed a commit to bakins/lua-nginx-module that referenced this issue Jun 30, 2012
…ocket cleanup handler due to the lack of check of the ctx pointer. thanks shaneeb for reporting this in github issue openresty#110.