crash with ngx.io #2
Here are all thread stack traces:

(gdb) thread apply all where
Thread 17 (Thread 0x7ff245cde700 (LWP 108)):
Thread 16 (Thread 0x7ff2454dd700 (LWP 109)):
Thread 15 (Thread 0x7ff242cd8700 (LWP 114)):
Thread 14 (Thread 0x7ff2434d9700 (LWP 113)):
Thread 13 (Thread 0x7ff2424d7700 (LWP 115)):
Thread 12 (Thread 0x7ff247ce2700 (LWP 104)):
Thread 11 (Thread 0x7ff2444db700 (LWP 111)):
Thread 10 (Thread 0x7ff243cda700 (LWP 112)):
Thread 9 (Thread 0x7ff246ce0700 (LWP 106)):
Thread 8 (Thread 0x7ff244cdc700 (LWP 110)):
Thread 7 (Thread 0x7ff2474e1700 (LWP 105)):
Thread 5 (Thread 0x7ff248ce4700 (LWP 102)):
Thread 4 (Thread 0x7ff2484e3700 (LWP 103)):
Thread 3 (Thread 0x7ff2494e5700 (LWP 101)):
Thread 2 (Thread 0x7ff249ce6700 (LWP 100)):
Thread 1 (Thread 0x7ff257c31740 (LWP 99)):
Thanks for the report. I will investigate it as soon as possible.
@alexandrlevashov I cannot reproduce this problem for now.
@alexandrlevashov |
I could not catch the arguments yet. It is not clear, because the queries are not generalized and there are many internal queries with capture and proxy_pass/upstream, both to itself and to the outside. I tried to implement a synthetic test, but it worked without a problem. The system where it crashes is highly loaded; it constantly transfers ~1 Gbit/sec. I will continue with synthetic tests later to figure out the case.
It is on by default. Should I disable it? As I remember, the nginx documentation does not recommend that for production systems.
Well, it's friendly for your clients if you enable
@alexandrlevashov
With the default configuration I can't reproduce this problem at all; other information is required to solve it. Could you tell me how your clients send requests to Nginx?
The request model is the following. There are PUT/HEAD/GET/DELETE requests. Most PUT requests write file data locally and send it to another service using this library: chunks are read, written to a local file, and forwarded to the other service. GET requests read the file and "print" the data to the output. The chunk size is 65536 bytes. PUT is the most interesting case, since the HTTP request that forwards the data is sent to a local "proxy_pass" location, which passes the query on to an external service. So the PUT sequence is essentially: read a chunk of the request body, write it to the local file, and forward it through the internal proxy_pass location to the external service.
So it might be related to such not-straightforward logic. But I didn't face the issue in my test environment either. It only occurs on highly loaded production servers. Input and external requests are secured with HTTPS.
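To make the described PUT flow concrete, here is a minimal sketch of such a handler, assuming ngx.io for the local write and ngx.location.capture against an internal proxy_pass location for the forwarding. The "/forward" location, the file path, and the detail of buffering the whole body for a single subrequest are illustrative assumptions, not taken from the reporter's actual configuration (the real setup may forward per chunk).

```lua
-- Illustrative sketch of the PUT flow described above: read the request body
-- in 64 KiB chunks, write each chunk to a local file with ngx.io, then
-- forward the data through an internal proxy_pass location.
local ngx_io = require "ngx.io"

local CHUNK_SIZE = 65536                 -- chunk size mentioned in the report
local filename = "/data/upload.tmp"      -- hypothetical local path

local file, err = ngx_io.open(filename, "w+")
if not file then
    ngx.log(ngx.ERR, "open failed: ", err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

-- Read the raw request body chunk by chunk from the downstream socket.
local sock, serr = ngx.req.socket()
if not sock then
    ngx.log(ngx.ERR, "failed to get request socket: ", serr)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

local body = {}
local remaining = tonumber(ngx.var.content_length) or 0
while remaining > 0 do
    local size = math.min(remaining, CHUNK_SIZE)
    local chunk, rerr = sock:receive(size)
    if not chunk then
        ngx.log(ngx.ERR, "receive failed: ", rerr)
        break
    end
    local bytes, werr = file:write(chunk)    -- local copy via ngx.io
    if not bytes then
        ngx.log(ngx.ERR, "write failed: ", werr)
        break
    end
    body[#body + 1] = chunk                  -- keep for forwarding
    remaining = remaining - #chunk
end

local ok, cerr = file:close()
if not ok then
    ngx.log(ngx.ERR, "close failed: ", cerr)
end

-- Forward the data through an internal location ("/forward" is a made-up
-- name) that proxy_passes to the external HTTPS service.
local res = ngx.location.capture("/forward", {
    method = ngx.HTTP_PUT,
    body   = table.concat(body),
})
ngx.status = res.status
```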
@z-trade OK. I will try to make the stress test more aggressive.
@z-trade @alexandrlevashov Due to a programming fault, an expected metatable wasn't set, so when the GC is running, a memory chunk is freed silently without calling the __gc method. I'm trying to fix this issue; although the crash stack is not the same, I think the underlying problem is similar.
I have fixed the problem on my local machine, but a network problem is preventing me from pushing my fix to GitHub. I will push it ASAP!
The ngx_http_lua_io_file_ctx_metatable_key doesn't hold the io file object ctx metatable correctly, so the expected __gc method wouldn't be called when the file_ctx object (a userdata chunk) was freed; instead it was simply freed silently. Consequently, our cleanup handler still hung on the r->cleanup linked list (it would have been removed if the __gc method had been called), and it was invoked when Nginx was freeing the current request. The handler would then try to access the already freed userdata chunk, causing a use-after-free, and the process might crash randomly. See #2 on GitHub for the details.
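For readers less familiar with Lua finalizers, the sketch below illustrates the general mechanism the fix relies on: a __gc metamethod only runs if the metatable is actually attached to the userdata. This is a plain Lua 5.1/LuaJIT illustration using the newproxy built-in, not code from the module itself.

```lua
-- Why a missing metatable skips the finalizer (plain Lua 5.1/LuaJIT;
-- newproxy creates a userdata object that we can give a metatable).

-- Userdata WITH a metatable carrying __gc: the finalizer runs on collection,
-- which is where a module would detach its cleanup handler.
local ud_ok = newproxy(true)
getmetatable(ud_ok).__gc = function()
    print("__gc called: cleanup handler would be detached here")
end

-- Userdata WITHOUT that metatable: it is collected silently, so any external
-- cleanup handler still pointing at it becomes a dangling reference -- the
-- use-after-free pattern described above.
local ud_bad = newproxy(false)

ud_ok, ud_bad = nil, nil
collectgarbage("collect")   -- only ud_ok's finalizer prints
```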
@z-trade @alexandrlevashov I have pushed the fix to this repository, see ed5decf for the details. Just try it and see whether any crash problem still exists!
Hi! I have just tried it for several hours on the highly loaded server. No crashes anymore. Good job! I will run it for several days and then come back with the final results.
Hi! So there are no crashes anymore, but the following error occurs very often (I assume even every time): "close() failed (9: Bad file descriptor)". It is produced at the end of the complex workflow described above, after the main data has already been written and re-sent and a small piece of data should be placed in a separate file. The exact code (the loop body was cut off in this report):

    local io = require "ngx.io"
    local file, err = io.open(filename, "w+")
    for name, value in pairs(headers) do
        -- ...
    end
    local ok, err = file:close()

The last file:close() is the problem: it returns the error "close() failed (9: Bad file descriptor)".
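As an aside, the close error can at least be surfaced explicitly from the Lua code rather than relying only on the nginx error log. The snippet below is a small illustrative pattern under that assumption (the path is a placeholder); it does not fix the underlying descriptor problem discussed in the follow-up issue.

```lua
-- Illustrative: check and log the result of close() instead of ignoring it.
local ngx_io = require "ngx.io"

local filename = "/tmp/meta.txt"   -- placeholder; the real path comes from the handler

local file, err = ngx_io.open(filename, "w+")
if not file then
    ngx.log(ngx.ERR, "open() failed: ", err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

-- ... write the header lines here ...

local ok, cerr = file:close()
if not ok then
    -- e.g. "close() failed (9: Bad file descriptor)", now tied to the request context
    ngx.log(ngx.ERR, "close() failed: ", cerr)
end
```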
Hmmm. I think this is really another problem and we shouldn't go further in this thread. Would you mind creating a new issue so we can discuss it there?
Sure.
Hi,
Your module looks very useful. Thank you for your work.
I built the library with OpenResty 1.15.8.1rc1 using this Dockerfile:
https://github.com/openresty/docker-openresty/blob/master/xenial/Dockerfile
The module configuration is the default one (as on the main page).
And I constantly get crashes with concurrent requests.
My use case is many clients writing and reading files simultaneously (files range from 500 bytes to 1 GB), and every 1-5 minutes nginx crashes, always with the same stack trace:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `nginx: worker process '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 ngx_http_lua_io_file_finalize (r=0x3730333d646e6f63, ctx=0x7ff24166e4e0)
at /tmp/lua-io-nginx-module/src/ngx_http_lua_io_module.c:1492
1492 *ctx->cleanup = NULL;
[Current thread is 1 (Thread 0x7ff257c31740 (LWP 99))]
With the standard Lua io library it works as expected.
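For comparison, the equivalent write path with the standard, blocking Lua io library (which, per the report, does not crash) looks like the following; the path and payload are placeholders.

```lua
-- Same write path using the standard blocking Lua io library instead of ngx.io.
local filename = "/tmp/foo.txt"      -- placeholder path
local payload  = "some chunk data"   -- placeholder data

local file, err = io.open(filename, "w+")
if not file then
    error("open failed: " .. tostring(err))
end
file:write(payload)
file:close()
```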
I can provide more info if you tell me what else I should gather.
Thank you. BR Alex.