Do not crash couch_log application when gen_* servers send extra args #1662

Merged
nickva merged 2 commits into master from fix-log-to-handle-extra-args on Oct 18, 2018

Conversation

@nickva
Contributor

@nickva nickva commented Oct 17, 2018

gen_server, gen_fsm and gen_statem might send extra args when terminating. This
is a recent behavior, and not handling these extra args could lead to couch_log
application crashing and taking down the whole VM with it.

Member

@davisp davisp left a comment

Might also be a good idea to change the supervision tree to something like 5/10 or 10/10 for the restart intensity to avoid restarting the node.
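To illustrate the restart-intensity suggestion, here is a minimal sketch of a supervisor that tolerates up to 10 restarts in any 10-second window. The module name `log_sup_sketch` and the empty child list are assumptions made for illustration; this is not the actual couch_log supervisor.

```erlang
-module(log_sup_sketch).
-behaviour(supervisor).

-export([start_link/0, init/1]).

%% Start the supervisor and register it locally under the module name.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% intensity/period: allow at most 10 child restarts within any
    %% 10-second window before the supervisor itself gives up and exits.
    SupFlags = #{strategy => one_for_one, intensity => 10, period => 10},
    %% Hypothetical: no children listed; a real tree would put its
    %% child specs here.
    {ok, {SupFlags, []}}.
```

With a low intensity, a single misbehaving child can exhaust the limit quickly, the supervisor exits, and the failure propagates up the tree. A higher value gives the children more chances to recover first.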

@@ -45,7 +45,7 @@ gen_server_error_test() ->
{
Pid,
"** Generic server and some stuff",
-[a_gen_server, {foo, bar}, server_state, some_reason]
+[a_gen_server, {foo, bar}, server_state, some_reason, sad, args]

@davisp

davisp Oct 17, 2018
Member

You should add a new test case so that we're testing with and without the extra args.

@@ -71,7 +72,7 @@ gen_fsm_error_test() ->
{
Pid,
"** State machine did a thing",
-[a_gen_fsm, {ohai,there}, state_name, curr_state, barf]
+[a_gen_fsm, {ohai,there}, state_name, curr_state, barf, sad, args]

@davisp

davisp Oct 17, 2018
Member

Samesies as the other test.

@@ -58,18 +58,18 @@ format(Level, Pid, Msg) ->

format({error, _GL, {Pid, "** Generic server " ++ _, Args}}) ->

@davisp

davisp Oct 17, 2018
Member

I also think it'd be a good idea to rename this function to do_format, and then we can wrap it in something like:

format(Arg) ->
    try
        do_format(Arg)
    catch
        T:R ->
            log_that_it_blew_up(T, R)
    end.
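
A self-contained version of the wrapper idea above might look like the following sketch. The module name `format_guard_sketch`, the single `do_format/1` clause, and the fallback message are all assumptions for illustration, not the code that was merged.

```erlang
-module(format_guard_sketch).
-export([format/1]).

%% Public entry point: a failure inside do_format/1 is caught here
%% instead of crashing the calling process.
format(Event) ->
    try
        do_format(Event)
    catch
        Class:Reason ->
            %% Fall back to describing the failure and the raw event.
            lists:flatten(io_lib:format(
                "Formatting failed: ~p:~p for event ~p",
                [Class, Reason, Event]))
    end.

%% Hypothetical formatter that understands only one event shape.
%% Any other shape raises function_clause, and a mismatched format
%% string raises badarg; both are caught by format/1.
do_format({error, _GL, {Pid, Fmt, Args}}) when is_list(Fmt), is_list(Args) ->
    lists:flatten(io_lib:format("~p " ++ Fmt, [Pid | Args])).
```

Returning a fallback string, rather than re-raising, keeps the logger alive so the original event is still recorded in some form.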
nickva added 2 commits Oct 17, 2018
gen_server, gen_fsm and gen_statem might send extra args when terminating. This
is a recent behavior and not handling these extra args could lead to couch_log
application crashing and taking down the whole VM with it.

There are two improvements to fix the issue:

 1) Handle the extra args. Format them and log as they might have useful
 information included.

 2) Wrap the whole `format` function in a `try ... catch` statement. This will
 avoid any other cases where the logger itself is crashing when attempting to
 format error events.
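
The first improvement can be sketched by matching the known leading arguments and collecting whatever follows with a list tail pattern. The module name `extra_args_sketch`, the function `describe/1`, and the output wording are hypothetical; the four leading elements follow the gen_server test data shown in this pull request.

```erlang
-module(extra_args_sketch).
-export([describe/1]).

%% Match the four known leading arguments. Anything after them is
%% bound to Extra, so a longer argument list no longer fails the match.
describe([Name, LastMsg, State, Reason | Extra]) ->
    Base = io_lib:format(
        "server ~p terminated: ~p (last msg ~p, state ~p)",
        [Name, Reason, LastMsg, State]),
    %% Include the extra arguments only when some were sent.
    Suffix = case Extra of
        [] -> "";
        _ -> io_lib:format(", extra: ~p", [Extra])
    end,
    lists:flatten([Base, Suffix]).
```

Calling `describe/1` with the old four-element list and with the new list ending in `sad, args` exercises both paths, which is the with-and-without coverage requested in the review.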

Previously it was too easy to crash the whole node when any of couch_log's
children restarted. To improve resiliency, let couch_log application restart
a few more times before taking down the whole node with it.
@nickva nickva force-pushed the fix-log-to-handle-extra-args branch from ff9b6d1 to 84b5acf Oct 17, 2018
@ksnavely
Contributor

@ksnavely ksnavely commented Oct 17, 2018

Nice fix!

@davisp
davisp approved these changes Oct 18, 2018
Member

@davisp davisp left a comment

+1

@nickva nickva merged commit c980b80 into master Oct 18, 2018
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
@nickva nickva deleted the fix-log-to-handle-extra-args branch Oct 18, 2018
3 participants