APISIX Graceful Shutdown for Plugin Runner #12723

@Fancypui

Description

Hi, I wanted to ask whether APISIX can handle graceful shutdown for the Java plugin runner.

Currently, from what I observed in "apisix/plugins/ext-plugin/init.lua", the external plugin runner is killed by the privileged agent in the "exit_worker_by_lua_block" OpenResty lifecycle hook:

function _M.exit_worker()
    if process.type() == "privileged agent" and runner then
        -- We need to send SIGTERM in the exit_worker phase, as:
        -- 1. privileged agent doesn't support graceful exiting when I write this
        -- 2. better to make it work without graceful exiting
        local pid = runner:pid()
        core.log.notice("terminate runner ", pid, " with SIGTERM")
        local num = resty_signal.signum("TERM")
        runner:kill(num)

        -- give 1s to clean up the mess
        core.os.waitpid(pid, 1)
        -- then we KILL it via gc finalizer
    end
end

However, the privileged agent might exit before the worker processes do, while those workers still need the Java plugin runner to process in-flight requests. How should this case be handled?

I tried sleeping for 20 seconds in the privileged agent before terminating the runner, as in the modified function that follows. However, during APISIX quit I still hit multiple 503 responses, with the errors "phase_func(): [nil] failed to receive RPC_HTTP_REQ_CALL: closed" and "plugin.lua:1207: run_plugin(): ext-plugin-pre-req exits with http status code 503".


function _M.exit_worker() 
    if process.type() == "privileged agent" and runner then
        socket.sleep(20)
        core.log.warn("Privileged agent after waiting 20 seconds")
        -- We need to send SIGTERM in the exit_worker phase, as:
        -- 1. privileged agent doesn't support graceful exiting when I write this
        -- 2. better to make it work without graceful exiting
        local pid = runner:pid()
        core.log.notice("terminate runner ", pid, " with SIGTERM")
        local num = resty_signal.signum("TERM")
        runner:kill(num)

        -- give 1s to clean up the mess
        core.os.waitpid(pid, 1)
        -- then we KILL it via gc finalizer
    end
end
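
For comparison with the fixed 20-second sleep above, here is a minimal sketch of a bounded polling wait: it returns as soon as a condition holds, or gives up at a deadline. Everything in it is an assumption for illustration: `wait_until`, `workers_drained`, and the injected `sleep`/`now` functions are hypothetical names, not APISIX or OpenResty APIs, and how the privileged agent could actually learn that the workers have finished their in-flight requests is exactly the open question in this issue.

```lua
-- Hypothetical helper, NOT part of APISIX: poll `is_done` until it returns
-- true or `timeout_s` seconds have elapsed. `sleep` and `now` are injected
-- so the sketch can run in plain Lua, outside OpenResty.
local function wait_until(is_done, timeout_s, interval_s, sleep, now)
    local deadline = now() + timeout_s
    while not is_done() do
        local remaining = deadline - now()
        if remaining <= 0 then
            return false  -- timed out; the caller would proceed to SIGTERM
        end
        -- never sleep past the deadline
        sleep(math.min(interval_s, remaining))
    end
    return true
end

-- Demo with a fake clock instead of real time.
local clock = 0
local function fake_now() return clock end
local function fake_sleep(s) clock = clock + s end

-- Assumption: stand-in for "workers have no in-flight requests left";
-- here it simply becomes true once the fake clock reaches 3 seconds.
local function workers_drained() return clock >= 3 end

local ok = wait_until(workers_drained, 20, 1, fake_sleep, fake_now)
print(ok, clock)
```

With a real drain signal, a wait like this would stop early when the workers are done and would still cap the delay at the timeout, after which the existing SIGTERM path would run unchanged.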

Metadata

Labels: question (label for questions asked by users)

Status: ✅ Done