This tool detects long-running coroutine resumes, i.e. resumes that spend their time in CPU-intensive computation and/or blocking system calls (e.g. disk read/write, os.execute).
A long resume blocks the nginx worker while it runs and causes many issues, e.g. cosocket timeouts.
Like systemtap, this tool does not touch the OpenResty source code, and it is NOT an nginx module either!
It simply hooks lua_resume() via LD_PRELOAD to do the check.
- systemtap affects runtime performance to some degree, while this tool does not.
- it seems hard to obtain information from inside lua_State via systemtap.
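The LD_PRELOAD hook described above can be sketched roughly as follows. This is a minimal illustration and not the tool's actual source: it assumes the Lua 5.1 / LuaJIT signature of lua_resume(), a dynamically linked Lua library, a hard-coded threshold (MIN_MS) and output to stderr, all of which are simplifications made for the example.

```c
#define _GNU_SOURCE   /* exposes RTLD_NEXT in <dlfcn.h> */
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>

typedef struct lua_State lua_State;               /* opaque in this sketch */
typedef int (*lua_resume_fn)(lua_State *, int);

#define MIN_MS 10L   /* hypothetical fixed threshold for this sketch */

/* Whole milliseconds between two clock samples. */
static long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000000LL);
}

/* Same name and signature as lua_resume() in the Lua 5.1 / LuaJIT C API,
 * so a preloaded library is found before the real definition. */
int lua_resume(lua_State *L, int narg)
{
    static lua_resume_fn real_resume;
    struct timespec start, end;
    int status;

    if (real_resume == NULL) {
        /* Look up the next definition of the symbol, i.e. the real one. */
        real_resume = (lua_resume_fn)dlsym(RTLD_NEXT, "lua_resume");
        if (real_resume == NULL) {
            return -1;   /* sketch only: no real lua_resume was found */
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    status = real_resume(L, narg);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (elapsed_ms(&start, &end) >= MIN_MS) {
        fprintf(stderr, "long resume: %ldms\n", elapsed_ms(&start, &end));
    }
    return status;
}
```

Such a file would typically be built as a shared object (for example with cc -shared -fPIC, plus -ldl on older glibc) and then loaded with LD_PRELOAD.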
# build ngx_lua_block_check.so
make
# enabled or not
env NGX_LUA_BLOCK_CHECK=true;
# log threshold, in ms
env NGX_LUA_BLOCK_CHECK_MIN_MS=10;
# log file prefix
env NGX_LUA_BLOCK_CHECK_OUTPUT_FILE=/tmp/ngx_lua_block_check.log;
If you change the configuration, reload nginx for the change to take effect.
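The nginx env directive places these variables in the environment of the worker processes, so a preloaded library can presumably read them with getenv(). The sketch below shows one way the threshold could be parsed. The helper names (parse_min_ms, configured_min_ms) and the fallback of 10 ms are assumptions made for illustration, not taken from the tool.

```c
#include <stdlib.h>

/* Parse a millisecond threshold; return the fallback if the value is
 * missing, empty, negative, or not a plain decimal number. */
long parse_min_ms(const char *value, long fallback)
{
    char *end;
    long ms;

    if (value == NULL || *value == '\0') {
        return fallback;
    }
    ms = strtol(value, &end, 10);
    if (*end != '\0' || ms < 0) {
        return fallback;
    }
    return ms;
}

/* Read the threshold from the worker's environment (hypothetical helper;
 * the default of 10 is an assumption of this sketch). */
long configured_min_ms(void)
{
    return parse_min_ms(getenv("NGX_LUA_BLOCK_CHECK_MIN_MS"), 10);
}
```

Falling back instead of failing keeps a typo in the configuration from stopping the worker.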
/usr/local/openresty/bin/openresty -s stop
LD_PRELOAD=./ngx_lua_block_check.so /usr/local/openresty/bin/openresty
tail -f /tmp/ngx_lua_block_check.log*
==> /tmp/ngx_lua_block_check.log.2282 <==
2017-08-20 23:56:44.424166 30ms /test (null),content_by_lua(nginx.conf:56),1 test,/usr/local/openresty/lualib/test.lua,7 yield
==> /tmp/ngx_lua_block_check.log.2287 <==
2017-08-20 23:56:54.556875 10ms /test (null),content_by_lua(nginx.conf:56),1 test,/usr/local/openresty/lualib/test.lua,7 yield
The log file names are suffixed with the pid of the worker process.
<resume begin timestamp> <resume duration> <url> <func1,src1,lineno1> <func2,src2,lineno2> <resume status>
Here func1 is the resume entry function, while func2 is the resume exit function.
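For post-processing, a log line in the format above can be split with sscanf(). The following is a small sketch, assuming that no field contains spaces (true for the sample lines, not guaranteed in general). The struct block_record type and parse_block_line() are hypothetical names introduced for this example, and the buffer sizes are arbitrary.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one log line; buffer sizes are arbitrary. */
struct block_record {
    char date[11];       /* YYYY-MM-DD */
    char time[16];       /* HH:MM:SS.uuuuuu */
    long duration_ms;    /* resume duration */
    char url[256];
    char entry[256];     /* func1,src1,lineno1 */
    char exit_[256];     /* func2,src2,lineno2 */
    char status[16];     /* resume status, e.g. "yield" */
};

/* Returns 1 if all seven fields were parsed, 0 otherwise
 * (for example for the "==> file <==" headers printed by tail). */
int parse_block_line(const char *line, struct block_record *r)
{
    int n = sscanf(line, "%10s %15s %ldms %255s %255s %255s %15s",
                   r->date, r->time, &r->duration_ms,
                   r->url, r->entry, r->exit_, r->status);
    return n == 7;
}
```

The entry and exit fields are kept as whole strings here. Splitting them further on commas is left out because a source name such as content_by_lua(nginx.conf:56) would need more careful handling.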